PLENARY SESSION

1. THE METALANGUAGE OF MACHINE TRANSLATION
AND ITS APPLICATION


N. D. Andreyev (Leningrad)

1. We call a metalanguage any linear system of signs used for the
written designation of the elements in a particular system of ideas and the
relations between these elements.

2. The class of metalanguages at the present time comprises mathematics,
physics, chemistry, formal genetics, and symbolic logic.

3. The preparation of algorithms for machine translation requires the
development of a special metalanguage in the symbols of which may be described
the facts and relationships of the language systems that are subject to
equivalent comparison.

4. The symbols used in the metalanguage of machine translation are
regarded as metalanguage words and grouped in categories analogous to the
parts of speech.

5. Commands in M.T. [machine translation] are regarded as
metamoods [META-NAKLONENIYA].

6. The use of metalanguage in the analytic part of algorithms.

7. The use of metalanguage in the transformational part of algorithms.

8. The use of metalanguage in the synthetic part of algorithms.

9. The possibility and value of a general theory of metalinguistic
systems.

10. A comparative analysis of the class of metalanguages and the class
of spoken languages may serve as a basis for elucidating the relations between
formal logical semeiotics and general linguistics.

2. SOME GENERAL PROBLEMS IN MACHINE TRANSLATION [M.T.]

I. K. Bel'skaya (Moscow)

1. Experience gained in preparing experimental routines for machine
translation from English, German, Chinese, Japanese, and Russian in the ITM
and VT [Institut tochnoi mekhaniki i vychislitel'noi tekhniki; Institute of
Precision Mechanics and Computer Engineering] of the Academy of Sciences,
USSR, confirms the assumption that translation, even in such an unusual form
as machine translation, is, as far as content is concerned, a linguistic
problem.


2. The development of linguistic methods of solving M.T. problems may
be achieved on the basis of so-called "traditional linguistics" and the
results of such work may be of definite interest to linguistics.

The systematization of language phenomena that accompanies M.T. research
should help to eliminate the well known contradictions and diffuseness in
the definitions of certain linguistic categories accepted at the present time.

3. A distinction between the lexical and grammatical aspects of the
translation problem seems essential. The difference in quality and degree
of lexical and grammatical abstraction emerges in the system of machine
translation with unusual clarity.

Rules of lexical character are recorded in a glossary. Grammatical
rules are not included in the glossary and form the content of so-called
"translation routines".

4. An M.T. glossary must be so constructed that its various parts can
expand unevenly.

An M.T. glossary may be divided into 2 main sections:

I single-meaning glossary, and

II multiple-meaning glossary.

Each of these is in turn subdivided into:

Ia glossary of technical terms;

Ib glossary of words in general use;
IIa glossary of full-meaning words;
IIb glossary of auxiliary words.

An M.T. glossary is accompanied by several auxiliary routines
(comprising one cycle in the translation routine) in order that the lexical
analysis of a sentence may be performed without human intervention.

1. Routine of dividing a sentence into words. Routine 1 is not
essential for all languages, only for such as Chinese, Japanese, Arabic,
etc., where the sentence is written down in the form of an unbroken succession
of signs with no spaces between the words.

2. Routine of obtaining the glossary form of a word

3. Grammatical analysis of "unknown words"


4. Syntactic analysis of "formulas"

5. Routine of distinguishing homonyms 

6. Routine of analysis of polysemy. 
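
Routine 1 above (dividing an unspaced sentence into words) can be illustrated with a minimal sketch. The abstract does not say how the routine was implemented; the greedy longest-match rule, the function name `segment`, the `max_len` limit, and the toy glossary below are all assumptions made for illustration.

```python
def segment(text, glossary, max_len=4):
    """Split an unspaced string into words by greedy longest match.

    Assumption: the glossary is a plain set of known words, and any
    character that starts no known word is emitted as a one-sign word.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a glossary hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in glossary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy glossary, not taken from the source.
toy_glossary = {"机器", "翻译", "机器翻译", "问题"}
print(segment("机器翻译问题", toy_glossary))
```

A longest-match rule is only one possible choice; it is shown here because it needs nothing beyond the glossary itself, which fits the stated aim of lexical analysis without human intervention.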

5. The basic problems of an M.T. glossary - size and polysemy - are
satisfactorily solved by combining the following two methods:

(a) division of the glossary into a series of "special glossaries"
corresponding to various spheres of human activity (in our case,
corresponding to the various branches of science);

(b) contextual (functional-semantic) analysis of the words.

6. The main features of an M.T. glossary are that it:

(a) contains a systematized description of each word that is
capable of ensuring the subsequent grammatical analysis of the word in the
sentence (the "invariant characteristics" of the word);

(b) provides for a genuine correspondence between two lexical 
systems, registering the "relevant meanings" of words; 

(c) takes cognizance of "zero meanings" of words, i.e. instances
where a word must not be translated into another language as a separate 
lexical unit. 

For the rest, an M.T. glossary may be arranged on the same principles 
as those underlying existing bilingual dictionaries. In particular, there 
is no need to convert an M.T. glossary into a "glossary of stems".
Moreover, a glossary of words has definite advantages for M.T. too.
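
The glossary features listed in items 5 and 6 can be pictured as a small data structure. This is a sketch under stated assumptions, not the layout used at the Institute: the class `GlossaryEntry`, the function `translate_word`, the `"general"` fallback key, and the sample entries are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class GlossaryEntry:
    word: str
    # "Invariant characteristic" of the word, e.g. its lexico-grammatical category.
    category: str
    # Branch of science -> equivalent. None marks a "zero meaning":
    # the word is not rendered as a separate lexical unit.
    equivalents: Dict[str, Optional[str]] = field(default_factory=dict)


def translate_word(entry, subject_field):
    """Pick the equivalent from the special glossary for the field, else the general one."""
    if subject_field in entry.equivalents:
        return entry.equivalents[subject_field]
    return entry.equivalents.get("general")


# Hypothetical English-Russian entries for illustration only.
solution = GlossaryEntry("solution", "substantive",
                         {"general": "решение", "chemistry": "раствор"})
the = GlossaryEntry("the", "particle", {"general": None})

print(translate_word(solution, "chemistry"))
print(translate_word(the, "chemistry"))
```

Keying the equivalents by branch of science mirrors method (a) in item 5; contextual analysis, method (b), would need further rules that the abstract does not spell out.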

7. The solution of the problem of grammatical analysis in M.T. is
connected with the realization of a logical, structural description of
language. Hence, conclusions drawn from solving this problem may have
certain general linguistic interest.

8. Following the grammatical analysis of 5 linguistic systems (English,
German, Chinese, Japanese, and Russian) for M.T., it seemed possible to use
a consistent system of dividing words into the following 9 lexico-grammatical
categories:

1. verbs,

2. substantives,

3. numerals,


4. adjectives,

5. adverbs,

6. prepositions [Chinese and Japanese postpositions may be classified
as prepositions on the basis of their resemblance to
prepositions in function],

7. conjunctions,

8. particles,

9. parenthetic words.

The principle of dividing words into these classes is similar to that
underlying the division of words into parts of speech. Hence, there is no
need to do away with the traditional names of the parts of speech. Only a
bit more precision is required.

Thus, the phases of numerals, adjectives, and adverbs have been changed
[illegible] in a separate class, but the pronominal category
differs for such parts of speech as substantives, adjectives, and adverbs.

Systematization of grammatical categories within each part of speech
[illegible] by the [illegible]

Analysis of sentence to be translated, and
Synthesis of translated sentence.

We call analysis routines that system of rules whereby the linguistic
analysis of a sentence to be translated can be performed in such a way as
to produce the information needed for the grammatical structure of the
translated sentence.

In the M.T. variant developed at the Institute of Precision Mechanics
and Computer Engineering of the Academy of Sciences, USSR, the analysis
routines include the following 8 routines in cycle II:

1. functional analysis of punctuation marks;

2. breakdown of sentences into clauses and more precise definition
of parenthetical phrases in clauses;

3. syntactic analysis of clauses;


4. "verb" routine;

5. "numeral" routine;

6. "substantive" routine;

7. "adjective" routine;

8. "changing word order in translated sentence" routine.

10. We call synthesis routines that system of rules whereby the 
grammatical structure of the translated clause can be formed. 

As of now 4 synthesis routines for the Russian sentence have been 
worked out:

1. word-forming routine; 

2. "verb" routine; 

3. "adjective" routine; 

4. "substantive" routine. 

It is proposed to develop a routine for editing the style of translated 
Russian sentences as well as synthesis routines for several other languages, 
particularly Chinese and English. 

This would make it possible to produce multilingual machine translation
(from many languages into many languages), using Russian, it is suggested, 
as an intermediary language. 

3. AN INTERMEDIARY LANGUAGE AND ARTIFICIAL
INTERNATIONAL LANGUAGES


Ye. A. Bokarev (Moscow) 

1. Creation of an intermediary language for machine translation or
an artificial Esperanto-type international language requires the solution
of several problems, the main one being the need to establish correspondences
between the lexical and grammatical units of languages that differ in their
structural characteristics.

2. International languages based on natural languages use everything
that is essential for communication and reject what is non-essential or of
little value (exceptions of various kinds, polytypic declensions and
conjugations, etc.). The most consistent in this respect are the autonomistic
languages (Esperanto and Ido). Languages of another kind - the naturalistic
(Interlingua and Occidental) - retain certain of the unjustified complications
and inconsistencies of natural languages.

3. The most important problems in the field of grammar are: indication
of the parts of speech, expression of subject-object relations, and word
order in sentences.

4. In the field of word formation there is the problem of productivity
of word-forming affixes and use of established patterns.

5. Some of these problems may be solved in various ways when an
intermediary language or an artificial language for international relations is
created. Nevertheless, there are many problems that can be solved in similar
fashion.

4. THE VALUE OF MATHEMATICAL METHODS IN LINGUISTICS

R. L. Dobrushin (Moscow)

1. Uses of linguistics as a justification for its existence. Classical
fields of use: teaching of languages and application to problems in history.

2. Demands on language research imposed by classical fields of
application of linguistics.

3. Newest fields of application of linguistics: mechanical translation
and use for transmission of information in the form of written and oral
linguistic material.

4. Problems and methods of linguistic research dictated by the newest
fields of linguistic application.

5. Mathematical methods of linguistic investigation:

(a) methods used in theory of numbers applied to investigation
of the grammatical structure of language;

(b) investigation of language structure by methods used in the
theory of information;

(c) linguistic statistics.

6. Interrelations between classical and modern linguistic techniques. 
Potential for the development of mathematical methods. 

5. CONVERSION OF COMMUNICATIONS AND CONVERSION OF CODES

V. V. Ivanov (Moscow) 

1. In theoretical investigations dealing with automatization of
linguistic processes, it is advisable to distinguish the conversion of
communications (texts) from the conversion of codes (sign systems).


2. By communication conversion we understand the translation of a
communication from one code into another (recoding) while retaining the
invariant information. When speech is transmitted at a distance, the
linguistic structure of the text is kept, which makes this case very simple.
When sentences are converted within a single language the linguistic
structure of the text is partially transformed. This transformation may,
therefore, be regarded as a first approach to machine translation. In
translating from one concrete language into another concrete language or
into an intermediary language, it is possible to preserve the characteristics
of the linguistic structure of the text, which are directly reflected on the
structure of the text in the other language. In translating into the logical
abstract language of an information machine, only the logical structure of
the text can be preserved. The increasing degree of difficulty of each of
these tasks is determined by the complexity of the rules for converting a
communication, which vary with the extent to which the information
appearing as an invariant during the conversions can be formalized.

3. By code conversion we understand the translation of one code into
another while retaining the code pattern. An intermediary language for
machine translation and an abstract machine language for an information
machine may be regarded as abstract systems, which are represented by the
concrete language of scientific and technical texts. Therefore, to develop
these abstract systems we require a formal analysis of the individual
concrete languages in order to reveal their common patterns. An abstract
machine language may be constructed by converting concrete languages derived,
in turn, from interpreting an abstract language. The general theory of code
conversion may be used for the deductive derivation of one scientific system
from another. In this connection it is necessary to investigate code
isomorphism in the various sciences (and code isomorphism in a single science
at various stages in its history). At the same time a general theory of
code conversion makes it possible to formulate with greater precision the
concepts of comparative and historical linguistics due to the fact that
comparative-historical calculation is a special case of code calculation.

6. THE SEQUENCE IN BUILDING A LANGUAGE SYSTEM

P. S. Kuznetsov (Moscow) 

1. Any language is a system of simple units of various orders so 
interlinked by hierarchical relations that each elemental unit is in some 
respect indivisible (without loss of some of its properties) and at the 
same time consists of a certain number of units of a lower order. 

2. The simple units of one order form what is called a level, stage,
or layer in a language system. Thus, one level is formed by such elemental
units as phonemes, another by morphemes, which consist of phonemes, a third
by lexemes (words), which consist of morphemes, etc.


3. When we build any language system, apparently the simplest way should
be to define in succession the units of the lowest order and then pass on
to the units of the next higher order, the units and relations in which
they must be defined in accordance with concepts already defined for the
next lower order. Thus, having defined the concept of phoneme, we may
define the morpheme, which always consists of a certain number of phonemes.

4. But if we proceed in this fashion, we shall not be able to
construct an internally consistent system, since at certain stages along the
way we will meet up with vicious circles (in the logical sense).

5. The reason is that a system of units in any single order requires
certain concepts lying outside itself for its own construction or, in other
words, forming with respect to it meta-concepts [META-PONYATIYA]. These
meta-concepts relate in part to the system of units in a lower order (with
respect to the order in question) and they may relate in part also to the
system of units in a higher order (with respect to the order in question).
Thus, the definition of phonemes and their interrelations (in the phonological
sense, to which I subscribe; I have often set forth in print the case for
this view, so there is no need for me to go into it again here) are based
not only on concepts from the field of phonetics, but also on some concepts
from the field of morphology, i.e., they relate to the level of morphemes.

6. A more complicated method of constructing a language system is
outlined on the basis of the foregoing. In some cases it is necessary to
proceed directly from the system of the lower (e.g., first) order not to the
next higher (in the given case, second) order, but to the following (in
our case, third) order; and having constructed it without utilizing the
concepts of the second order, to proceed to this last; and then to return to
the system of the third order and finish constructing it, now also making
use of the concepts relating to the system of the second order.

7. MACHINE TRANSLATION STUDIES IN THE MATHEMATICAL
INSTITUTE OF THE ACADEMY OF SCIENCES, USSR

A. A. Lyapunov and O. S. Kulagina (Moscow)

I. Introduction

1. Electronic computers are a highly efficient means of processing
information.

2. It is practical to use electronic computers as an auxiliary tool
for intellectual work.

3. Human speech as a means of transmitting information.

4. The importance of making it possible for machines to use human
speech.


5. Machine translation as a first step in instructing machines to
work with a language.


II. Brief Description of Work Done


6. French-Russian translation. Empirical formulation of rules.
Construction of an algorithm suited to the machine's capabilities.
Elaboration of problems connected with coding and information conversion
in the machine memory and with the organization of programs to increase the
efficiency of machine operation. Utilisation of scales. Work on improving
the algorithm and programs on the basis of experimental translations.


7. English-Russian translation. Use of structural-syntactic analysis
of English. Classification of English and Russian words on the basis of
formal criteria. Grammatical configurations of English and Russian: a
comparison. Problems in eliminating homonymy. Use of experience with
French-Russian translation in problems connected with coding, program
construction, and Russian sentence analysis.


8. Problems in automatizing translation programming. Operational
description of translation algorithms. Compiling program, constructing the
translating program according to its operational description. Significance
of experience gained in programming French-Russian translation.


9. Theory of numbers approach to the construction of a formal grammar.
Classification of words, identification of configurations, determination of
relations between words. Possibilities of using a similar approach to
syntax and phonetics.


10. Basic principles of operation: advance by "ledges" [USTUPAMI];
maximal theoretical interpretation of each step; planning of work based on
interrelations between machine and thought; close contact between groups
working on different languages; joint work of mathematicians and linguists
at all stages starting with the formulation of translation rules.



11. Linguistic problems in machine translation.

(a) Development of precise system of linguistic concepts, their
operation in translation algorithms as a criterion of usefulness.

(b) Development of methods of constructing translation algorithms
for different languages.

(c) Intermediary languages, construction and use.

(d) Problems in linguistic statistics.

(e) Investigation of language structure on the basis of translation
algorithms.

12. Technical problems in machine translation.

(a) Elaboration of effective designs for translation machines.

(b) Establishment of operational systems for these machines.

(c) Elaboration of special memory devices (large capacity with
swift retrieval).

(d) Design of special input and output devices.

13. Mathematical problems in machine translation.

(a) Development of effective means of coding information at the
various stages of operation.

(b) Increasing the output of algorithms.

(c) Investigation of abstract language models and translation [illegible].

(d) Elaboration of a mathematical language to describe translation
[illegible].

(e) Automatization of programming of translation algorithms.

14. Combined-cybernetic problems.

(a) Machine output of algorithms.

(b) Machine production of linguistic statistics.

(c) Machine construction of models of concrete languages on the
basis of limited text materials.


IV. Problems Connected with Work in the
Field of Machine Translation


15. Need to elaborate different approaches to the problem by different
research groups maintaining close contact among themselves. Value of
cooperation in machine translation. Need to establish systematic exchange of
information between groups working in different cities.


16. Need for representatives of the various fields of specialization
to participate in the work on machine translation: mathematicians, linguists,
and engineers constantly cooperating at all stages of the work from formulation
of rules to study of experimental translations.


8. AN INTERMEDIARY LANGUAGE MODEL FOR MACHINE TRANSLATION


I. A. Mel'chuk (Moscow)

The following represents one of the possible solutions to the problem
of machine translation from many languages into many languages:

1. Two sets of rules are worked out for each language:

(a) The rules of analysis which, with the help of appropriate
glossaries and charts, effect the transfer of a text into
a conventional numerical code in such a way that each word
in a given form and given syntactic function is matched
one-for-one with a chain of figures called the set of
information for the word. The series of sets of information
developed is broken down into paired typical combinations
with which the relations existing in each given pair of
information sets have been matched one-for-one. The fixed
relation between the two sets of information (containing
the syntactic relation between the corresponding words) is
called a "configuration". One member of the pair which
satisfies the given configuration is called the "governing"
and the other the "governed" member. The total number of
configurations is not very large (in a specialised text,
no more than 200).

As a result of the analysis, each word in the text
to be translated is replaced by a set of information and
each set contains an indication of what configuration it
satisfies and which member it is.

(b) The rules of synthesis permit transition from the numerical
code, i.e., from the series of sets of information to
words, to the actual text. This operation is the reverse
of analysis described above.


Each configuration contains an indication of what
form a word that satisfies the configuration in question
as either member of the pair must have. Therefore, if
we know the stem of a word, the kind of configuration,
and exactly how the word satisfies it, we can synthesise
the necessary forms.


Both analysis and synthesis are effected in complete
independence of the translation.

2. A special system of rules and charts is being worked out,
determining correspondences between the conventional numerical codes of
different languages (identical correspondences are not essential; rules for
choice may be used). These correspondences are established on 3 levels:

(a) lexical correspondences (i.e., lexical transfer of stems);

(b) grammatical correspondences (transfer of so-called
"extra-syntactic" categories as, for example, number in nouns
or tense and mood in verbs);

(c) syntactic correspondences (correspondences between
configurations - syntactic relations of different languages -
as well as correspondences between groups of configurations:
clauses and various types of phrases).

This abstract system of correspondences is also called
an intermediary language which does not exist, therefore,
as any real or artificial language but represents a unique
calculus.

3. The translation process consists of three steps:


analysis: transition from a text in the source language to a
series of configurations;

transition: from a series of configurations in the source
language to a series of configurations in the target language;


synthesis: transition from a series of configurations in
the target language to a genuine text in it.
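
The three steps can be sketched as three small functions over a list of configurations. This is a toy illustration, not the model described in the abstract: the `Configuration` record, the single "attribute" relation, the two-word English input, the `LEXICON` table, and the French word-order rule are all assumptions chosen to keep the example short.

```python
from typing import List, NamedTuple


class Configuration(NamedTuple):
    kind: str       # hypothetical label for the syntactic relation
    governing: str  # stem of the governing member
    governed: str   # stem of the governed member


# Toy lexical correspondences (English stems -> French stems).
LEXICON = {"red": "rouge", "house": "maison"}


def analyse(text: str) -> List[Configuration]:
    """Analysis: source text -> series of configurations (toy: one adjective + noun)."""
    adjective, noun = text.split()
    return [Configuration("attribute", governing=noun, governed=adjective)]


def transfer(configs: List[Configuration]) -> List[Configuration]:
    """Transition: source configurations -> target configurations."""
    return [Configuration(c.kind, LEXICON[c.governing], LEXICON[c.governed])
            for c in configs]


def synthesise(configs: List[Configuration]) -> str:
    """Synthesis: target configurations -> text.

    Toy rule: in the target language the governed attribute follows
    its governing noun.
    """
    return " ".join(f"{c.governing} {c.governed}" for c in configs)


def translate(text: str) -> str:
    return synthesise(transfer(analyse(text)))


print(translate("red house"))
```

Because word order is decided only in `synthesise`, from which member governs which, the sketch reflects the point that analysis and synthesis for each language can be written independently of any particular language pair.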


4. Underlying the translation is syntactic analysis: establishment
of configurations, i.e., ascertaining the relations between words in
the source language and expressing these relations by the most suitable means
in the target language. Such morphological data as case, number, and person
of a verb (also the use of auxiliary words is provisionally included here)
are used only as aids while ascertaining the syntactic relations.


5. During the course of syntactic analysis both the functions
of words in the sentence ("sentence members") and the interdependence of
words are established. The latter factor is especially important, since
the interdependence of words makes it possible during synthesis to regulate
their arrangement, i.e. to achieve the best word order.


6. The model of an intermediary language that has been worked 
out for machine translation includes for the present Russian, English, 

Chinese, French, and Hungarian. The purpose is to develop a system of 
formulating rules and the best method of recording and arranging the material. 


9. THE SIGNIFICANCE OF MACHINE TRANSLATION
FOR LINGUISTICS

M. I. Steblin-Kamenskii (Leningrad)


Besides promoting cooperation with representatives of the precise
sciences and thereby instilling linguists with the need for greater accuracy 
in their research and formulations, work on machine translation is important 
for linguistics in three respects: 

(1) It is critical of all the traditional grammatical concepts,
primarily those like the "parts of speech", "members of a clause", "clause",
etc. Based, as it is, on practical considerations, this criticism will be
more objective and effective than purely theoretical criticism.

(2) It makes clear that the same linguistic fact may be described
in various ways depending on what general definitions or terminological
conventions are used, with the result that all the dogmas established in
the individual branches of linguistics need to be reviewed.

(3) It will aid in overcoming linguistic "semantism" [SEMANTIZM],
i.e. the practice whereby linguists follow the line of least resistance and
study meanings, not the structure of language. Language differs from other 
sign systems not by the existence of meanings (which are not peculiar to 
language), but by the structure of expression. 

10. THE "ACTIVE" AND "PASSIVE" GRAMMAR OF L. V. SHCHERBA
AND THE PROBLEMS OF MACHINE TRANSLATION 


I. I. Revzin (Moscow) 

1. The polysemantic term "grammar" (either "grammatical structure of 
a language" or "description of the grammatical structure of a language")
is one cause of the erroneous conception that a given language has only a 
single grammatical structure, that there is only one correct "grammar" (as 
a description of a system). 

2. The description of a language system depends on the goal that an 
investigator sets for himself. This notion was the core of the remarkable
theory of L. V. Shcherba on "passive" and "active" grammar, which has 
suffered undeserved oblivion. 

3. "Passive grammar studies the functions and meanings of structural 
elements in a language on the basis of their forms, i.e., the external side.
Active grammar teaches the use of these forms." (L. V. Shcherba) The 
purpose of instruction in passive grammar is to teach one to understand a 
text in the language. The purpose of instruction in active grammar is to 
teach one to express thoughts in the language. 



Approved For Release 2000/08/24 : CIA-RDP68-00069A0001 00200007-9 





4. One of the dangers pointed out in connection with L. V. Shcherba's
ideas is the assumption of a "denudation of thought" or "existence of
thought without language" in passing from form to pure meaning and from pure
meaning to form. However, no cognizance was taken of the fact that a
thought need not be registered in a concrete language; it may be registered
in an abstract, artificial language where there is a simple, reciprocal
correspondence between the designator and the thing designated.

5. Machine translation assumes precisely such an abstract language,
namely an intermediary language that must be implicitly present in any
machine program and will apparently be described in the near future. If
cybernetic analogies are adequately grounded, one may assume that the
analogue of such an intermediary language is present in any translation (and,
generally, in any form of logical activity).

6. Machine translation has demonstrated the correctness and need of
a separate approach to the problem of text analysis ("passive" grammar in
L. V. Shcherba's terminology) and to the problem of text synthesis ("active"
grammar).

7. The first problem was effectively solved by purely formal means.
The limits of machine translation depend on a full solution of the second
problem (the compilation of a list of synonyms - by synonymy we understand
the presence of several units corresponding to a single unit in an abstract
language or, what amounts to the same thing, a single unit of thought - and
an algorithm for retrieving an equivalent under the given logical conditions).

8. Experience with machine translation has shown that, generally
speaking, an inverse ratio is observable between the "active" and "passive"
grammar of a language: the more complex the "passive" grammar, the simpler
the "active", and vice versa. Hence, for a number of languages emphasis
wholly on passive grammar might considerably alleviate the language curricula
in schools.

9. L. V. Shcherba's ideas on the distinction between active and passive
grammar, as strengthened and enriched by experience with machine translation,
must ultimately find application in foreign language teaching (in secondary
schools as well as in colleges and universities).

10. Secondary schools should make wide use of the methods of passive
grammar, which are not only unusually effective for analyzing an unfamiliar
text, but correspond to the habits of logical thinking developed in
mathematics classes. Moreover, interest in learning the grammar of a
foreign language can be heightened by introducing exercises in translating
sentences "by machine". This would also serve the interests of polytechnical
instruction.




11. The same considerations apply as well to language teaching in
the natural science departments of universities and in the higher technical 
institutions where little use of the well developed formal-logical habits 
of students has been made up to now in foreign language teaching. 


12. Creating a scientific theory of "active grammar" would not only 
push forward the frontiers of machine translation, but assist instruction 
in language schools where grammar is still taught in undifferentiated 
fashion. This is of particular concern to translation departments where 
necessity dictated the conversion of a theory of translation into a theory 
of active grammar. 


11. A GENERAL THEORY OF TRANSLATION IN CONNECTION
WITH MACHINE TRANSLATION

V. Yu. Rozentsveig and I. I. Revzin (Moscow)


1. The possibility of creating a scientific theory of translation is
still being argued by a number of specialists, both linguists and literary
critics. Nor has there been any final answer to the question of whether a
theory of translation concerns scientific linguistics or belongs to the
field of literature.

2. The polysemantic term "translation" also awaits a definition. The
historical paramountcy of artistic translation has resulted in the
conceiving of every translation as an artistic production, as a creative
achievement in the realm of language. Meanwhile, the development of new types
of translation activity, chiefly in the field of scientific and technical
literature, has made another conception of translation urgent, i.e. as a
process of establishing principles of correspondence between the structures
of two languages.

3. Disclosure of the possibility of translating texts by a machine
and development of a theory of machine translation has shown that
distinguishing between the fields of translation makes limitation of both
concepts logically unavoidable:

(a) "translation₁" is translation as a form of creative activity;

(b) "translation₂" is translation as the establishment of strict
correspondences.

Translation as a form of creative activity is an object of study for
theorists of literature. Translation as the establishment of strict
correspondences is an object of study for linguists.
5. A linguistic theory of translation must regard translation
("translation₂") as a special kind of decoding with subsequent encoding into
another system of symbols. The distinctive feature of this transformation of
information is in the irreversibility of the process. The reason is that
simple, reciprocal correspondences between language systems are lacking.
Hence, rules for correspondence in translation are complicated by the need
to formulate a number of restrictive conditions. Determination of these
conditions is a proper object for a linguistic theory of translation. A
general linguistic theory of translation studies ideal types and routines
for matching systems of language symbols; a particular theory of
translation analyzes the correspondences between the two languages. A general
theory of translation is chiefly a deductive discipline, while a particular
theory of translation is inductive.

6. Thus, the methodology of a linguistic theory of translation
comprises:

(a) methods of structural comparative analysis or, in other words,
analysis of the synchronous stages of various languages;

(b) methods of linguistic statistics;

(c) methods of logical semantics, more precisely general semeiology.

The very listing of these methods shows the main difference between the
linguistic and literary theories of translation. The latter requires:

(a) a study of the era;

(b) world outlook and creative method of the writer and literary
school;

(c) peculiarities of his individual artistic style.

7. From the semantic point of view "translation₂" is a certain
reflection in itself (a system of elemental meanings is assumed to be
invariant). "Translation₁", from this point of view, is not a reflection in
itself, since pragmatic meaning, which plays a major role in "translation₁",
does not coincide in two languages.

8. Having marked off the object and methods of a linguistic theory
of translation, we can not only ascertain the limits of machine
translation, but also create a well structured, definitive theory of
translation, that is to say a separate, scientific linguistic discipline.
Creation of this discipline can help to perfect methods of training
translators. It will undoubtedly find application in the teaching of foreign
languages as well.
THEORETICAL SECTION

12. SPECTRA OF PHONEMES AND THEIR USE IN
MACHINE TRANSLATION

V. A. Artemov and I. A. Zimnyaya (Moscow)

1. Oral information and translation machines must, among other things,
be accessible to people with varying physical characteristics of speech.
Therefore, their system of signalling must be based on the phonemic
invariants of sounds or, in other words, on the spectra of phonemes.

2. Three aspects of the spectral analysis of speech sounds must be
distinguished: (1) syntactic (phonologic), (2) semantic (phonetic), and
(3) pragmatic (technical-communicative).

3. A syntactic investigation of spectra of phonemes is based on
contrasts within the sound system of a given language. A semantic
investigation relates the spectra of phonemes to word meanings and
grammatical forms. A pragmatic investigation of the spectra of speech sounds
originates in and services practical needs.

4. A syntactic and semantic investigation of spectra of phonemes
provides an exhaustive analysis of their physical properties, which form
structures bearing a comparative and systematic character.

5. A pragmatic investigation of spectra of phonemes requires the
determination of their minimal characteristics, which permit of their full
or partial restoration, i.e. it becomes a compression of the spectra of
phonemes. A pragmatic investigation of spectra of phonemes becomes their
compandor, including the compression and expansion of amplitude.

6. The Laboratory of Experimental Phonetics and Speech Psychology
(LEF and PR) /Laboratoriya eksperimental'noi fonetiki i psikhologii rechi/
of the First Moscow State Pedagogical Institute of Foreign Languages
(MGPIIYa) /Moskovskii gosudarstvennyi pedagogicheskii institut inostrannykh
yazykov/ conducted investigations of the spectra of 5 vocalic phonemes of
the a, o, u, i, e type in the following languages: (1) Russian (V. A. Artemov
and I. A. Zimnyaya), (2) Georgian (T. G. Tsibadze), (3) Armenian (A. M.
Aranyan and A. A. Khachatryan), (4) Lettish (I. A. Zimnyaya), (5) Albanian
(I. A. Zimnyaya), (6) Bulgarian (I. A. Zimnyaya), (7) Czech (I. A. Zimnyaya),
(8) German (L. P. Blokhina and I. A. Zimnyaya), (9) French (K. K. Barashnikova
and V. S. Sokolova), (10) English (I. A. Zimnyaya). In addition, data on
English were drawn from the works of Paget, Green and Potter, Peterson,
and Kopp for purposes of comparison with the studies of the LEF and PR.
7. All the material was recorded with a basic tone of 120-150 cycles
per second at a level of 65-70 db. The pronunciation of each speaker was
representative of the literary speech of the various languages.

8. A comparison of the quantitative and graphic data shows that the
following pragmatic rules are observable within each language:

(a) The a-type vowel is characterized by a wide formant region
(600-1200 cycles) with gradually increasing intensity of
the components in the direction of high frequencies (1200-
2500 cycles).

(b) The o-type vowel is characterized by a central formant region
somewhat shifted down to 400-1000 cycles per second.

(c) The u-type vowel is characterized by a somewhat narrower
central formant region shifted still further toward the low
frequencies of 300-800 cycles per second with a maximal
elevation of amplitude in the range of 300-350 cycles per
second.

(d) The i-type vowel is characterized by two main formant regions.
The first is in the range of lower frequencies and almost
coincides with the range of maximal intensification in the
main formant of the u-type vowel, as Paget has pointed out.
But a gentle falling-off is observed in amplitude of the
u-type, and a steep falling-off in the i-type.

(e) The e-type vowel is distinguished from the i-type by the
formants shifted more to the center. The broader the e,
the closer the formants come together.

9. The above-mentioned acoustical properties of the vowels completely
correspond to the position and operation of the resonance chambers of the
vocal apparatus, as stated in several reports of the LEF and PR as well as
by Paget and Yakobson.

10. These studies indicate that the spectra of vowels on the syntactic
and semantic plane have a structural character. V. A. Artemov suggested a
means of determining these structures. It consists of separating from the
vowel spectrum all the areas of relative intensification and establishing
correlations between them, taking the lowest of them as 1.
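
The procedure in thesis 10 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say whether the "correlations" are taken between frequency positions or intensities, so the sketch assumes frequency positions; the function name `structural_ratios` and the example peak values are invented for the purpose and are not data from the report.

```python
def structural_ratios(peak_frequencies_hz):
    """Express each area of intensification relative to the lowest one.

    Assumption: each area is represented by one centre frequency in
    cycles per second, and the lowest area is taken as 1.
    """
    peaks = sorted(peak_frequencies_hz)
    if not peaks or peaks[0] <= 0:
        raise ValueError("need at least one positive frequency")
    lowest = peaks[0]
    return [round(f / lowest, 2) for f in peaks]


# Hypothetical centre frequencies for one vowel from two speakers.
speaker_1 = [300, 900, 2400]
speaker_2 = [330, 990, 2640]

print(structural_ratios(speaker_1))
print(structural_ratios(speaker_2))
```

Under this reading, two speakers whose absolute frequencies differ would yield the same list of ratios, which is the kind of invariant the next thesis refers to.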

11. At the same time a comparison of the spectra of the 5 types of
vowels studied indicates that a structural correlation between the areas
of intensification is retained within definite limits in the languages
investigated. In this connection it is possible to speak about a certain
structural and comparative invariant of these types of vowel spectra in the
various languages, which is essential for signalling technique in
translation machines.
13. AN OBJECTIVE INVESTIGATION OF MEANING ASSOCIATIONS

O. S. Vinogradova and A. R. Luriya (Moscow)

1. An objective investigation of the association of meanings that are
aroused in man by some word or other is a basic necessity for psychology as
well as for linguistics.

Despite the considerable progress achieved by modern linguistics,
information theory, and psychological investigation of the development of the
meaning of words in children, objective research techniques both of potential
associations aroused by words and of the dynamics of these associations still
remain to be worked out.

2. The use of different variations of the conditioned reflex method
may play a vital role in elaborating objective ways of investigating
meaning associations. By combining the showing of a word with some kind
of involuntary reflex response (vascular, cutaneous-galvanic, etc. reaction)
and then showing other words, the investigator is in a position to establish
objectively what group of the words shown elicits similar reactions and is
consequently, to some extent, the equivalent of a previously shown word; and
at the same time, he is in a position to trace both the structure and the
dynamics of these associations.

3. The report discusses the results obtained from an objective
investigation of the system of meaning associations by registering the
specific and non-specific conditions of vascular reactions. Conclusions are
drawn concerning certain factors that may determine the structure and
dynamics of these associations in normal and abnormal experimental subjects.

14. THE TREATMENT OF CERTAIN CONCEPTS IN STRUCTURALISM

V. T. Grigor'yev (Moscow)

1. Interest has grown of late in the methods and concepts of the
structuralist approach in linguistics due to the development of machine
translation and other branches of applied linguistics. However, recent
articles have treated certain structuralist concepts in an excessively
one-sided manner and, in essence, incorrectly.

2. Phonemes are treated as though they were connecting elements
lacking physical reality. The physical character of the differential signs of
phonemes is denied. Real speech sounds are represented as something
external with respect to language. Meanings, which are also removed from
language, receive the same treatment. This method of handling speech sounds
and meanings reflects only the views of L. Yelm'slev's group and is not to be
ascribed to structuralism in general.
3. Actually, the structuralist method of investigating speech sounds
takes into account their acoustic and articulative properties. The
functional criterion used by the structuralists in phonetics makes it possible
to isolate from the entire diverse mass of phonetic material the physical
(acoustic and articulative) properties that carry the functional load and,
consequently, are of prime importance to the linguist. The functional
criterion ensures a differentiated (from the viewpoint of language structure)
approach to the varied and changing properties of phonetic material. Using
the functional criterion, linguists may be very helpful to engineers in
solving practical problems confronting the several branches of engineering;
contrariwise, orientation on pure relationship elements would prevent the
linguists from solving practical problems and do away with the possibility
of cooperation between them and the engineers.

4. The attitude of the structuralists toward meaning was determined
by their interest in working out an objective method of investigating
language. The striving to escape from the inadequacies of traditional
linguistics led the structuralists to refuse in general to consider meaning
as a solid criterion of linguistic form. However, this refusal to take
account of meaning in research methodology does not determine the
structuralists' theoretical treatment of meaning. In many cases it exists
harmoniously side by side with the acknowledgment of meaning as a basic
element in the functioning of language. It must be admitted, however, that
rejection of the semantic criterion imposed severe limitations on this school
of linguistics. In practice, the field of semantics remained outside
structural analysis.

5. The meaning of a word is the linguistic form of expressing an idea.
Meaning cannot be separated from language simply because it does not exist
prior to or apart from language. At the same time, meaning is a basic factor
of language, determining its structure. It is important for the further
development of applied linguistics that objective methods of semantic analysis
be worked out. Naturally, in solving this problem full use will have to be
made of the experience gained by the representatives of structuralism in
their objective investigations of language.

6. A critical exploitation of the experience of structuralism is
scientifically advisable. It is an indispensable preliminary stage in the
task of introducing mathematical research methods into linguistics.

15. THE SIGNIFICANCE OF FREQUENCY AS A FACTOR IN

V. M. Grigoryan (Yerevan)

1. A comparative study of modern Russian-Russian dictionaries reveals
contradictory data. Thus, in various dictionaries (e.g., monotypic 4-volume
works) one and the same word may be defined in different ways from the
viewpoint of the language's limitations with respect to stylistic usage,
and it is often possible to find inconsistencies in the order in which
meanings are arranged. These (and other) contradictions make things
difficult for the reader who seeks information in order to determine the
operative norms for a given linguistic fact.

2. Since the norms, as a rule, are correlated with the factor of
frequency, statistical data are extremely essential in many cases, if
maximal precision is to be attained. Some considerations supported by
Russian language data (with due regard for strict synchronousness) are
cited by way of illustrating this contention.

3. The plan proposed by us is not original: it agrees in principle
with that employed in several frequency dictionaries published abroad
(Harry H. Josselson, The Russian Word Count, Detroit, 1953;
Victor Garcia Hoz, Vocabulario usual, vocabulario comun y vocabulario
fundamental, Madrid, 1953).

They are usually constructed on the basis of the familiar correlation
between style and genre. Adopting this plan on the whole, we propose to
set up 4 categories: (1) verse, (2) speech in dialogues, (3) speech in
monologues - using material from fiction, (4) non-fiction literature -
newspapers, documents, etc. It is obvious that statistical data reflecting
the frequency of usage of a specific word in each of the 4 categories must
be selected on the basis of equal conditions. Clearly, these equal
conditions will be ensured if the frequency of a given word is derived from
an equal number of words in all 4 categories. If we designate the
categories by a, b, c, and d, respectively, the total preliminary number of
words in category a must be equal to the total preliminary number of words
in category b, etc. This word total, it seems to us, can be advantageously
determined by using the Boss method. In addition, selections must be made
from purely random material (but within the given category); the more varied
the material, the more accurate will be the information.

The resultant data can be used to determine stylistic functions.
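
The equal-conditions requirement can be illustrated with a short Python sketch. It is an assumption-laden simplification: it samples individual words at random, whereas the abstract speaks of random selections of material within each category, and it does not attempt the Boss method for fixing the word total. The function name, the category labels, and the toy word lists are all invented for illustration.

```python
import random
from collections import Counter


def equal_sample_frequencies(corpora, sample_size, seed=0):
    """Count word frequencies in an equally sized random sample per category.

    corpora maps a category label to a list of words; every category
    contributes exactly sample_size words, so counts are comparable.
    """
    rng = random.Random(seed)
    result = {}
    for category, words in corpora.items():
        if len(words) < sample_size:
            raise ValueError(f"category {category!r} has too few words")
        result[category] = Counter(rng.sample(words, sample_size))
    return result


# Toy material, invented for illustration only.
corpora = {
    "a_verse": "the night the sea the moon a dream of the sea".split(),
    "b_dialogue": "well you know the train is late you see".split(),
    "c_monologue": "i thought the road would never end that night".split(),
    "d_nonfiction": "the committee approved the plan for the new plant".split(),
}

frequencies = equal_sample_frequencies(corpora, sample_size=8)
for category, counts in frequencies.items():
    print(category, counts["the"])
```

Because every category contributes the same number of words, the count of a given word can be compared across the four categories directly.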


16. AN EXPERIMENT TO DEFINE THE CONCEPT OF


R. L.


A given finite number of words is examined. A finite, ordered
aggregate of words is called a sentence. The division of all sentences into
two non-crossing classes is assumed to be given: a class of grammatically
valid sentences and a class of grammatically invalid sentences. Word a is
called subordinate to word b, if any valid sentence containing word a remains
valid after a is replaced by b. Two words a and b are called equivalent,
if a is subordinate to b and b is subordinate to a. All words are divided
into non-crossing classes of words equivalent to one another. Class
A is subordinate to class B, if all words entered into class A are
subordinate to words entered into class B. The system of classes and
subordinations thus obtained - called the basic grammatical structure of
the language - is examined. The result is a definition of the concept
of grammatical category.
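
The definitions above are mechanical enough to try on a toy vocabulary. The following Python sketch is offered as one possible reading, not as the author's procedure: it assumes that "replacing a by b" means replacing one occurrence at a time, and the word list and the set of valid sentences are invented for illustration.

```python
def is_subordinate(a, b, valid):
    """True if every valid sentence containing a stays valid when
    one occurrence of a is replaced by b."""
    for sentence in valid:
        for i, word in enumerate(sentence):
            if word == a:
                replaced = sentence[:i] + (b,) + sentence[i + 1:]
                if replaced not in valid:
                    return False
    return True


def equivalence_classes(words, valid):
    """Group words that are subordinate to one another in both directions."""
    classes = []
    for w in sorted(words):
        for cls in classes:
            rep = cls[0]  # mutual subordination is transitive
            if is_subordinate(w, rep, valid) and is_subordinate(rep, w, valid):
                cls.append(w)
                break
        else:
            classes.append([w])
    return classes


# Toy language: sentences are tuples of words; everything else is invalid.
words = {"cat", "dog", "runs", "sleeps", "sleep"}
valid = {("cat", "runs"), ("dog", "runs"), ("cat", "sleeps"), ("dog", "sleeps")}

print(equivalence_classes(words, valid))
```

In this toy language "sleep" occurs in no valid sentence, so it is (vacuously) subordinate to every other word, while no other word is subordinate to it; it therefore forms a class of its own.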

17. THE THEORY OF PROBABILITY AND DETERMINATION
OF LINGUISTIC RELATIONSHIP


A. B. Dolgopol'skii (Moscow)

The proposed method of determining the relationship of language families
by applying the theory of probability is, in broad outline, as follows:

1. On the basis of linguistic experience, those semantic points
are isolated in which maximum historical stability of morphemes (without
borrowing) is observed.

2. A determination is made in each group of languages under
consideration as to which morphemes possessing a given meaning may with
greater probability be regarded as the older. The usual techniques of
comparative historical research as well as the method of internal
reconstruction are used for this purpose.

3. We cannot speak about phonetic correspondences between language
families being compared before the fact of relationship has been established.
Hence, at this initial stage of investigation we must rely wholly on phonetic
resemblances. More precisely, we rely here on the following probability
correlations. Following a comparison of cognate languages, it appears that the
n-sound is the most probable of all the sounds in any single related language
that correspond etymologically to the n-sound of another related language.
The same may be said of the m-sound. But, possibly, not of the s-sound.
At any rate, among all the sounds that correspond in one language to the s,
g-sounds in another related language, the most probable, apparently, are
the sounds of the same s, ss-group. This would also seem to be true of the
l, r-group, the b, p, f-group, the t, d-group, the k, g, k, h-group, etc.
In this connection, we perhaps can't say anything about vowels or laryngeals.
Starting with these probability considerations, we may be able (leaving
aside the vowels and laryngeals) to base our subsequent discussions on the
data of consonant coincidences between various morphemes in the different
languages. We will term "consonant coincidence" the correspondence between
consonants that remain within one of the above mentioned groups. These groups
must be chosen in such a way that phonetic shifts of these sounds are no
more probable than retention of the sound (retention within the group).
The groups cited here are obviously only for illustrative purposes. Actually,
comparative-historical phonetic materials from all possible language families
must be used to establish the most probable sound correspondences (one of
these correspondences is the most probable for several sounds - the
correspondence of a sound to itself). As a result, we may select, let us say,
10 or 7 different sound units which will constitute the material for
phonetic comparisons.

4. Comparing the equivalent morphemes in the different families,
we note the phonetic coincidences (cf. para. 3). We then use appropriate
formulas from the theory of probability to measure the probability of the
accidental coincidence of a certain number of morphemes in so many languages,
from so many comparable items, taking into account the number of old
synonymous morphemes for each semantic point of each language group as well
as the total number of consonants distinguished during the comparison
(cf. para. 3).

If the probability of accidental coincidence proves to be quite low,
it will be a weighty argument in favor of the relationship of the languages
in question.

Use of the theory of probability will enable us to test the evidence
from comparisons between the various languages cited in numerous works
dealing with the problems of language family relationship (e.g., Trombetti,
Winkler, etc.).
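
The abstract does not state which formulas are meant. As a minimal sketch, one may assume the simplest model: each comparison of one consonant has the same chance probability of falling into the same group, the groups are equally likely, and the comparisons are independent. The chance of at least k coincidences in n comparisons is then a binomial tail. The function names and the numbers below are illustrative assumptions, and the sketch ignores the correction for synonymous morphemes that the abstract calls for.

```python
from math import comb


def chance_match_probability(n_groups):
    """Chance that two unrelated consonants fall in the same group,
    assuming all groups are equally likely (a simplifying assumption)."""
    return 1.0 / n_groups


def prob_at_least(k, n, p):
    """Binomial tail: probability of k or more chance coincidences
    among n independent comparisons, each succeeding with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical figures: 10 sound units, 15 semantic points, 6 coincidences.
p = chance_match_probability(10)
print(prob_at_least(6, 15, p))
```

A small value of `prob_at_least` for the observed number of coincidences corresponds to the "quite low" probability of accidental coincidence that the thesis treats as evidence of relationship.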

18. A GENERAL THEORY OF DEFINITION AND THE
POSSIBILITY OF [...] THEORY OF
TRANSLATION [...]

A. A. Zinov'yev (Moscow)

1. The process of translating from one language into another may be
described as a language consisting exclusively of definitions. Breakdown
of the language into elements is here assumed to be effected. It is
possible to model the formal apparatus of definitions, one may suppose,
by means of a special device. Having determined all possible definition
type relations at least between a selected part of the elements of one
language and a selected part of the elements of another language, we can
use the modelling device to produce in standard form at least partial
translation (if only in initial approximation).
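
A very small Python sketch may make the idea of partial translation through definitions concrete. It assumes the crudest possible case, a one-to-one table of definitions between already segmented elements; the table entries and the function name are hypothetical and stand in for the "modelling device" only by way of illustration.

```python
def translate_partially(elements, definitions):
    """Replace each source element that has a definition; mark the rest
    as untranslated so that the output remains a partial translation."""
    output = []
    for element in elements:
        if element in definitions:
            output.append(definitions[element])
        else:
            output.append(f"<{element}>")
    return output


# Hypothetical definitions covering only part of the source elements.
definitions = {"mashina": "machine", "perevod": "translation"}

print(translate_partially(["mashina", "perevod", "teksta"], definitions))
# prints ['machine', 'translation', '<teksta>']
```

Elements outside the selected part of the language are passed through in angle brackets, which corresponds to translation "if only in initial approximation".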


2. A general theory of definitions is constructed as part of a
theory of symbols. Several variants are possible depending on the original
concepts in the statement and on the formal apparatus for constructing the
theory. The suggested variant is characterized by an initial concept
"Choice", a special means of defining the concepts "Symbol", "Term", and
"Definition". The formal apparatus is constructed on the basis of the
functors ("Each of") and ("Anyone and only one of") and on the
admission as an initial logical connection of such connection as could
possibly, with some limitations, be represented as a formal implication
without contraposition.

A general theory of definitions may contain proof of rules for
definitions, elicitation of the conditions governing their use, rules for
deduction and their interconnections.


19. LINGUISTIC PROBLEMS CONNECTED WITH POETRY TRANSLATION

V. V. Ivanov (Moscow)

1. The distinction between the poetic model of a text and this text
may serve as a convenient starting point in solving the problem of poetry
translation. Translation makes it possible to recreate the same poetic
model by means of another language while retaining the relation between
the model and the text. On the other hand, the direct conversion of a
poetic text in one language into a poetic text in another is impossible.

2. The amount of information contained in a text is determined by
the extent of deviation of this text from the statistical norms of ordinary
language and from the statistical norms of the poetic language of a given
era. A violation of the statistical norms of ordinary language may become
the norm of poetic language, which results in decreasing the amount of
information contained in poetic texts. Poetry translation assumes the
transmission of the statistical characteristics of a text in conformity
with the language into which the translation is made.
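
One way to make "deviation from a statistical norm" measurable is the average surprisal of the words of a text under a reference word distribution. This measure is an assumption of the sketch, not something the abstract specifies, and the two "norms" below are invented probability tables. The sketch shows the point of the thesis: a text that departs from the ordinary norm but conforms to the poetic norm scores lower against the latter.

```python
import math


def average_surprisal_bits(text_words, norm_probabilities, floor=1e-6):
    """Average of -log2 p(word) under a reference ("norm") distribution.

    Words missing from the norm receive the small probability floor.
    """
    if not text_words:
        raise ValueError("text must contain at least one word")
    total = 0.0
    for word in text_words:
        total += -math.log2(norm_probabilities.get(word, floor))
    return total / len(text_words)


# Invented norms: word probabilities in ordinary and in poetic language.
ordinary_norm = {"the": 0.5, "sea": 0.25, "azure": 0.0625}
poetic_norm = {"the": 0.5, "sea": 0.25, "azure": 0.25}

line = ["the", "azure", "sea"]
print(average_surprisal_bits(line, ordinary_norm))
print(average_surprisal_bits(line, poetic_norm))
```

Measured against the poetic norm, the same line carries less information than against the ordinary norm, because its deviation has itself become normal.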


3. The sound structure of verse is determined by the phonological
structure of the language, as was first pointed out by R. Yakobson. It
follows from this that transmission of the phonetic characteristics of
the text structure is possible only when the corresponding elements in the
phonological systems of the two languages coincide. The non-translatability
of a poetic text is to a very large degree determined by the fact that in
poetic language the plane of content is functionally connected with the
plane of expression: insofar as the plane of expression is in principle
untranslatable, the plane of content appears partially untranslatable. This
limitation may also apply to the poetic model of the text, if (as with
Khlebnikov) units from the plane of expression are included in this model.


4. Phonetic coincidences of parts of words are used to organize a
poetic text chiefly in cases where they are superfluous from the
morphological point of view. Conversely, morphologically essential phonetic
resemblances contain the least amount of information from the viewpoint of
poetic organization of speech (cf. the problem of verbal rhythm in Russian).
Consequently, the possibility of transmitting phonetic repeats
/POVTORY/ depends not only on the phonological, but also on the morphological
resemblances between the two languages in question.


5. The predominance of syntagmatic connections between words over
paradigmatic connections is a peculiarity of poetic text on the plane of
content. We may see in this the result of transforming language text in
accordance with a poetic model (the relation between "emotion and text"
/[...] I TEKST/, to use O. Mandel'shtam's terms). This transformation
can be effected not only in the original but also in the translation.

6. A line-by-line analysis of metre does not yield results
because it reveals little about the rhythmic structure of long
passages, which are the real units of poetic speech (cf. the definition of
a period as "wave length" in Milton's verse, as suggested by T. S. Eliot).
If a line-by-line translation is always impossible, then for a translation
based on a poetic model of the entire text it is considerably more
important to translate the major rhythmic and syntactic units into which
the work is divided; as an example, Rilke's translation of VYKHOZHU ODIN
YA NA DOROGU /I go out alone onto the road/ is analyzed. The continuity of
an invariant text model and its inability in principle to be formalized
/NEFORMALIZUYEMOST'/ exclude the possibility of automatic translation of
poetry by modern computers.

20. GÖDEL'S THEOREM AND LINGUISTIC PARADOXES

V. V. Ivanov


1. The resemblance between mathematics and linguistics also applies
to the trends of these sciences as they develop in the 20th century. The
theoretical foundations of the sciences are being investigated in anticipation
of practical applications; the results of these investigations will eventually
prove vital for practical purposes.

2. Gödel's theorem, according to which the non-contradictoriness of
a theory cannot be demonstrated within the formalized theory itself, may be
extended to linguistic theories by means of Lotse's generalisation of the
theorem, which comes down to an affirmation of the incompleteness of any
system of symbols (including language). However, it would be essential
not to restrict oneself to this formulation in investigating the
foundations of structural linguistics, but to examine the conclusions
resulting from a linguistic analogue of Gödel's theorem.

3. The most severely formalized theories of language that examine
constructional linguistic objects were developed within the framework of
distributive analysis, which assumes the possibility of describing the
elements of a language on the basis of their distribution. It is not
difficult to show that application of this principle leads to paradoxes
(e.g., in the distributive separation of phonemes, word classes, meanings
of polysemantic words, etc.). The distribution of elements turns out to be
impossible, if these elements were not given previously. But the axiomatic
introduction of language elements contradicts not only the principles of
distributive investigation, but also the requirements of automatic analysis
of written and oral speech. The axiomatic [...] a class of regular sentences
appears to [...]
4. For the reasons given above, it is possible at the present time
to fashion a formal theory, which can be used to construct a program of
automatic analysis, only for a maximally simplified approximation to a
real language. We have in mind cases with simple correspondence between
form and substance: on the plane of expression - for a system of standard,
typical variants of phonemes, on the plane of content - for a standard
language of science. The absence of paradoxes when these cases are analyzed
does not permit, however, of extending the results obtained to ordinary
language, the metalanguage for which (unlike the cases mentioned) cannot
be formalized (this applies both to the phonological and to the semantic
metalanguage). Automatic analysis of real language requires the employment
of linguistic methods other than those considered above and the use of
self-teaching type machines (with probability elements).

21. METHODS OF BREAKING DOWN A SYNTACTIC WHOLE

L. I. Iliya (Moscow)

1. Linguists representing the most different schools use as a starting
point in their methods of analysis the possibility -- objectively existing
in any language -- of isolating a certain "whole" as a maximal unit that
can be broken down into similar segments, i.e. comparable in any respect
whatsoever. This "whole", which has been variously called "utterance",
"sentence", or "clause", belongs simultaneously to all the planes or "levels"
of a language -- phonological, grammatical, and semantic -- and is
characterised by the fact that its borders coincide in all three planes, which
makes this segment a maximally complete or basic unit for any decomposition.

2. The breakdown of a "whole", due to its complexity of structure,
is done on the basis of criteria that differ for each plane. As a result,
it yields segments the boundaries of which do not always coincide or, as
they say, are not "commensurable". Semantic decomposition is to a certain
extent independent of the grammatical, and it fails to establish a fixed
correlation between the boundaries created by rhythmic-intonation
decomposition and the boundaries of morphemes, words, and groups of words.

3. Modern linguists have attempted to eliminate the incommensurability
of the planes by seeking a single principle common to all stages of analysis.
However, unity of principle is achieved in some theories by ignoring some
aspect of language structure (e.g., meaning is excluded in Harris' method
and in rhythmic-intonation decomposition of Trager and Smith, while
grammatical structure is ignored in Shcherba's intonation-semantic
decomposition). Orderliness of method is attained at the price of simplifying
linguistic analysis, which therefore cannot be regarded as adequate for
research in all its complexities. However, new methods of analysis focussing
on formal criteria have been used to study them deeply, and modern techniques
of measuring such language units as phonemes, morphemes, and words have
reached a high degree of precision.
4. The task of linguistic analysis is not only to isolate the basic
linguistic units, but to determine the relations between the units that
allow of semantic relations. The contemporary school of linguistics
acknowledges as "structural", i.e. which deal with linguistic analysis, only
those relations to which definite forms of expression, "signals", correspond.

5. Two main trends in the investigation of syntactic relations can be
discerned at the present time: (a) the comparatively recent theory of
"direct constituents" (Bloomfield, Pike, Wells), which bases sentence
decomposition on the relations of a logical hierarchy of subordination that
links all the parts of a sentence into a single whole, and (b) the theory
which may be provisionally called the theory of "members of the sentence".
The latter has a long tradition and many opponents, but finds support among
the major representatives of contemporary linguistics (Kurilovich, Basel in
part, Diederichsen). The theory considers the sentence a whole, the parts
of which are linked together by functional relations.

6. The direct constituent method, which is based on a single type of
relationship, the heterogeneity of functions of the constituents, leaves
the general problem of determination of syntactic relations open and
investigates for the most part the combinability of constituents and typical
patterns.

On the other hand, for the theory of "members of the sentence" the
problem of syntactic relations is fundamental. Formerly, these relations
were all too frequently distinguished purely on the basis of meaning, not
of formal criteria, although the inclusion of such criteria in the principle
is desirable and feasible (Fries, Togeby). The study of basic
syntactic relations requires for its own continuing development that all
modern methods of linguistic analysis be utilised, particularly the
technique of distributive analysis.

The "direct constituents" and "members of the sentence" methods do
not exclude one another. Rather, they are complementary, as they permit
the sentence to be studied in various respects.

22. THE LOGICAL NATURE OF CONTEXT

G. V. Kolshanskii (Moscow)

1. The term "context", given the polysemia of language forms, may be
defined from the linguistic point of view as a combination of conditions
determining the simple, concrete identification of any linguistic phenomenon
(lexical, grammatical, etc.). By "simple" we are to understand the display
of only one of the many possible properties of the form under the given
conditions (e.g. one meaning of a word, one word order, one intonation, etc.).
In this report we are considering cases of determination of meaning in
polysemantic words regardless of the method of origination (metaphor,
metonymy, homonymy, etc.).

2. Contextual conditions may be found within the language itself,
but they may also comprise indications lying outside language material.

Among the language conditions it is necessary to distinguish between
indications included within a single sentence and textual indications.
Among the external conditions it is necessary to differentiate between
situation, object, and graphic indications.

3. The combination of possible conditions called context may be realised
while the precise meaning of a sentence is being formulated in language only
through a definite, active, logical process, since indications by themselves
are inert and can influence the meaning of a linguistic form only
as a starting point in the functional process of achieving a result that
makes sense.

Since the method of search by context is effective in the semantic
area of language, it is in essence a speculative, logical process of
reasoning about the meanings of language forms.

This rational search for the essential and uniquely correct (in the
ideal approach to a solution of this problem) result is a process of
constructing a syllogism or chains of syllogisms where the answer needed to
establish the true meaning of the word and sentence is the final deduction.

4. A syllogism is constructed by searching for the appropriate
premise of a universal hypothetical syllogism (if ... then) or by
interpreting a hypothetical-disjunctive judgment, the complexity of which
depends on the character of the indication underlying the premise.

While searching for the unknown meaning through external extralinguistic
indications, a syllogism is formed in accordance with the nearest indication
contained in the context (e.g. determination of the meaning "table" as a
piece of furniture in the sentence "He has a good table" is based on
situation). The meaning "table" may be "either A or B". Here it is not B.
Therefore it is A, i.e. a piece of furniture.
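
The "either A or B; not B; therefore A" step can be sketched in a few lines
of Python. This is only an illustration of the elimination rule described
above, not a procedure given in the report; the function name and the sense
labels are hypothetical.

```python
def disjunctive_syllogism(candidates, excluded):
    """Return the one sense left after elimination, or None if none or several remain."""
    remaining = [sense for sense in candidates if sense not in excluded]
    return remaining[0] if len(remaining) == 1 else None


# Hypothetical sense inventory for "table" (labels are illustrative only).
senses = ["piece of furniture", "tabulated data"]
# Assumed outcome of inspecting the situation: the second sense is ruled out.
ruled_out = {"tabulated data"}

result = disjunctive_syllogism(senses, ruled_out)
print(result)
```

When more than one sense survives, the sketch returns None rather than
guessing, which corresponds to a judgment that has not yet been resolved.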

5. Mention of the subject is sufficient for the major premise in order
to determine the meaning through objective context (e.g. determination of
the meaning of the word "solution" in chemistry and electrical engineering
is made in similar fashion). The form of writing in a written text may
also serve as a starting point for a syllogism about the true meaning of
a word (e.g. a foreign spelling).

6. The method of searching for the determining factor through lexical
environment is the most familiar way of determining the meaning of
linguistic forms. The premise is based on the immediately adjacent word
(starting of a Sputnik, starting of a motor) and a word standing in any
position in the appropriate group (an effective operation to destroy ... a
hostile garrison, vermin, tumors, etc., where all the semantic variants
are included as members in the major premise of a syllogism).

7. If there are insufficient resources within the sentence, the
meaning of a word is sought by forming several syllogisms to search for the
premise of the last conclusion on the basis of the entire paragraph
(e.g. "we did not allow our house to be [...]"). If it is not a question
here of a concrete house and family, then the word "house" must be
understood to mean "company" /business/. After examination of the text, the
first two senses are set aside and the meaning "company" remains as an
affirmative answer according to the rule for a disjunctive syllogism. In
German, "alle Räder stehen still" /all the wheels are standing still/ [...]
a similar analysis for the meaning [...].


8. The process of ascertaining the true meaning may be [...] a
hypothetical-disjunctive syllogism, but depending on the [...] of the desired
result the conclusion may be reached either by [...] part of the disjunctive
judgment (given the possibility of [...] of all the meanings of a word) or by
first forming a disjunctive judgment (meaning: the word A may be either ...
or ...). It should also be kept in mind that each operation is subject to
rechecking.


9. [...] context is essentially a [...] process, it can in principle be
theoretically formulated as an ordinary logical operation and be performed by
a [...] is the decisive factor in a given case.

[...] any arrangement in connection with machine translation [...] in a
context, the formalised operation to search for the necessary meaning may be
worked out by introducing a simple quantor (a thematic quantor). When the
meaning of a word is being interpreted on the basis of immediate environment,
a virtually disjunctive [...] may be set up, obviously consisting of up to
3-5 words occurring before and after a polysemantic word, provided, however,
that preliminary linguistic analysis determines all the cases where the
meaning of the given word depends on words capable of being associated with
it. At the present stage this work can be performed only for a [...].
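
A search of the immediate environment of a polysemantic word, limited to a
few words before and after it, might look like the following Python sketch.
It is not the report's procedure: the cue table stands in for the
"preliminary linguistic analysis" mentioned above, and the window size, the
function name, and the sense labels are all assumptions made for
illustration.

```python
def sense_from_window(tokens, index, cues, window=5):
    """Look up to `window` words on either side of tokens[index] for a cue word."""
    start = max(0, index - window)
    neighbours = tokens[start:index] + tokens[index + 1:index + 1 + window]
    found = {cues[word] for word in neighbours if word in cues}
    # Decide only when exactly one sense is signalled by the environment.
    return found.pop() if len(found) == 1 else None


# Hypothetical cue table: neighbouring word -> sense of "starting".
cues = {"motor": "switching on", "sputnik": "launching"}

tokens = "the starting of a motor requires current".split()
sense = sense_from_window(tokens, tokens.index("starting"), cues)
print(sense)
```

If no cue, or cues for conflicting senses, fall inside the window, the
function returns None, which matches the report's point that such a search
cannot always be completed formally.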


10. If the context extends beyond a sentence, the solution of a
disjunctive syllogism is practically impossible, since one cannot formally
identify the indications on the basis of which the parts even of a fully
set up disjunctive judgment will be eliminated.

Likewise formally insoluble is the problem of ascertaining the limits
of the operation to analyse context (both within a sentence and, much more
so, outside it). This is the realm of active, creative thought. Thus, a
complete, practical solution to the problem of mechanical determination
of word meaning by context is excluded. Formalization of the rules for
interpretation of context in machine translation clearly requires the
application of statistical methods for probability determination of the
contextual meaning of the words.

23. LINGUISTIC STATISTICS FROM RUSSIAN TEXTS

R. G. Kotov (Moscow)


1. The development of machine translation and the application of
methods of analysis and synthesis to communications technology have created
sound conditions for expanding cooperation between linguists and engineers.
In this connection there has arisen a need to introduce into linguistics
objective research methods permitting mathematical handling of the data.
Linguistic statistics, which operates with quantitative values, offers wide
possibilities for linguistic research. Linguistic statistical data are
used to solve a number of problems in machine translation and communications
technology. In addition, they may be successfully exploited for
lexicographical purposes and for foreign language teaching.

2. The current statistical investigation of Russian language texts
aims at preparing preliminary data in connection with constructing the
program of lexical coding of telegraphic messages. The work was first done
by hand on specimens of texts containing a total of 20,000 words. Methods
of analysis were determined by the existing possibilities and research
goals. The texts to be analysed were entered in order on index cards in
the form of two-member word combinations, which made various types of
calculation possible.

It is proposed to use in the future machine methods for several
tabulations, e.g. word frequency.

3. The treatment of the material has yielded thus far a frequency
glossary, a glossary of stable word combinations, and data on the frequency
of endings. Some principles governing the statistical distribution of
the glossary for the texts examined are elucidated on this basis.

4. Superfluousness in Russian texts of the type investigated is being
determined by taking cognisance of probability correlations in the glossary.
A theoretical limit to the savings expected from lexical coding is being
ascertained. Lexical coding is regarded here as a particular case of
decorrelation /DEKORRELYATSII/ of messages by consolidation /UKRUPNENIYA/.


5. Work is going on to: elucidate the main types of two-member word
combinations the sequence of which makes up a text, ascertain the provisional
probabilities of endings, and eliminate the uncertainty of choice of
grammatical form in relation to the preceding word. Data obtained on the
material of two-member word combinations are assumed to apply to
multi-member word combinations and to the sentence as a whole.
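
The tabulation of two-member word combinations lends itself to a short
Python sketch. The report does not describe its counting procedure in
detail, so what follows is an assumed reading: the "provisional
probabilities" are treated as conditional probabilities given the preceding
word, whole words are used in place of endings for brevity, and the sample
telegram text is invented.

```python
from collections import Counter


def bigram_model(words):
    """Count two-member combinations and how often each word opens one."""
    pairs = Counter(zip(words, words[1:]))
    firsts = Counter(words[:-1])
    return pairs, firsts


def conditional_probability(pairs, firsts, prev, nxt):
    """Estimate P(nxt | prev) as count(prev, nxt) / count(prev as first member)."""
    if firsts[prev] == 0:
        return 0.0
    return pairs[(prev, nxt)] / firsts[prev]


# Invented sample in telegraphic style (not data from the study).
words = "send reply today send reply now send goods today".split()
pairs, firsts = bigram_model(words)
p = conditional_probability(pairs, firsts, "send", "reply")
print(round(p, 3))
```

To tabulate endings instead of whole words, each word could be replaced by
its final letters before counting; where the boundary of an ending lies is a
linguistic decision that this sketch leaves open.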

24. A METHOD OF DEFINING GRAMMATICAL CONCEPTS

O. S. Kulagina (Moscow)

1. Inconvenience of existing grammatical systems for machine translation
and need to elaborate precise definitions of concepts.

2. Initial base of undefined concepts: word, sentence and OTMECHENNAYA
sentence, environment.

3. Breakdown of multitudes of words into submultitudes /PODMNOZHESTVA/,
consolidation of breakdowns.

4. Concept of B-equivalence, amalgamation of B-equivalent submultitudes.
Derived breakdown. Theorem concerning the impossibility of secondary
amalgamation by equivalence.

5. Sequence of amalgamation of words: families, classes, types. Concept
of a simple language. Two definitions of type and their equivalence.

6. Determination of configuration, resultant element, ranks of
configurations.

Concept of subordination of configurations.

Determination of relations between elements of configurations.

25. A FORMAL THEORY OF THE SENTENCE

I. I. Revzin (Moscow) 

1. More than 200 different definitions of "sentence" make, on the
one hand, a deductive development of syntax impossible and show, on the
other, that the approach to the problem of defining basic linguistic units
requires greater precision.

2. Any definition of a language element is a metalinguistic expression
(explicit or implicit). "Sentence" as a language word is, by its nature,
different from "sentence" as a metalanguage word. Therefore, the aim to
include in the definition everything that is intuitively understood when
the sound complex "sentence" is pronounced is scarcely realizable. A term
in linguistics, like an expression in metalinguistics, may reflect only
certain intrinsic features corresponding to the usage of the word.

3. An analysis of existing definitions of "sentence" makes it
possible to divide them into two unequal groups. The overwhelming number
of definitions are connected with the purpose of the sentence, i.e. they
include mention of the fact that "sentence" is a language unit serving to
convey a "more or less complete thought". Only a few definitions are
based on particularly formal criteria.

4. The defect of "sense" definitions lies chiefly in the fact that
they violate the principle of homogeneity: they depart from the sphere of
language as a system and assume or sanction the dissolving of an object of
linguistics in an object of logic or psychology. Moreover, phrases like
"more or less complete thought" and even simply "complete thought" are not
defined more or less strictly in logic itself. The linguists are thereby
doomed to waiting passively for the progress of logical semantics which,
it is easy to demonstrate, cannot itself develop without greater precision
of linguistic concepts.

5. The defect of existing "formal" definitions as compared with 
"sense" definitions is that they lack the idea of syntactic coherence 
(according to Aidukevich), i.e. what is most important in this unit of 
language for a linguist. 

Sentence "coherence" is reflected, as a rule, in the "sense" definitions, 
but it is reflected functionally through the coherence of the judgment. 

6. The formal definitions of "sentence" /PREDLOZHENIYA/ coincide in
substance with the definitions of "sentence" /FRAZY/. Meanwhile, the
linguist is acutely aware of the need to distinguish between the two
concepts.

7. The theory-of-numbers conception of language created by Soviet
mathematicians is a completely explicit metalanguage of linguistics in
which the basic linguistic categories may be rigorously defined.

8. In particular FRAZA /sentence/, i.e. the ordered succession of
smaller units, is taken as the original, undefined concept (the aggregate
of meaningful or correctly constructed sentences in a certain language is
considered given).

9. Introduction of the concept of configuration, strictly defined in
metalinguistic terms, makes it possible to describe a relation of syntactic
dependency, while the isolation of regular configurations enables us to
obtain the complete analogue of a "syntagma" or "word combination".

10. The individual elements (parts) of the syntagma (they are described
in the formal system as S-groups or relatems /RELYATEMY/) may be regulated
by the relationship of syntactic subordination. It is in these terms that
the concept of coherence is formulated.

11. A sentence may be called a cohesive number of S-groups (or
relatems) such that for each S-group A there is one and only one S-group
B such that B is syntactically subordinate to A.

12. Calling two S-groups linked by mutual syntactic subordination a
predicative pair leads to the following theorem: a sentence has one and
only one predicative pair.
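
The definition and the notion of a predicative pair can be checked
mechanically on a small example. The Python sketch below takes the wording
above literally and is not the report's formal system: the relation is
encoded as pairs (A, B) meaning "B is syntactically subordinate to A",
"cohesive" is read as connectedness of the link graph, and the labels S1, S2,
S3 are hypothetical. It tests a given structure and proves nothing about the
theorem itself.

```python
def check_sentence(groups, subordinate):
    """Test the stated definition; return (is_sentence, predicative_pairs).

    subordinate: set of (a, b) pairs meaning "b is syntactically subordinate to a".
    """
    # Uniqueness: each S-group A has exactly one B subordinate to it.
    unique = all(sum(1 for (a, _) in subordinate if a == g) == 1 for g in groups)

    # Cohesion, read here as connectedness when links are taken as undirected.
    neighbours = {g: set() for g in groups}
    for a, b in subordinate:
        neighbours[a].add(b)
        neighbours[b].add(a)
    start = next(iter(groups))
    seen, stack = {start}, [start]
    while stack:
        for n in neighbours[stack.pop()] - seen:
            seen.add(n)
            stack.append(n)
    cohesive = seen == set(groups)

    # Predicative pairs: two distinct S-groups subordinate to one another.
    pairs = {frozenset((a, b)) for (a, b) in subordinate
             if a != b and (b, a) in subordinate}
    return unique and cohesive, pairs


groups = {"S1", "S2", "S3"}
relation = {("S1", "S2"), ("S2", "S1"), ("S3", "S1")}
ok, predicative = check_sentence(groups, relation)
print(ok, len(predicative))
```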

13. The suggested definition meets all the requirements set forth at
the beginning of this report: it is formal, reflects the idea of coherence,
and is sufficiently close to the intuitive conception of the term "sentence".
It also permits us to derive deductively the idea of predicativity.

14. Exclusion of the so-called "single-constituent sentences" is
justifiable on two main grounds. First, whatever may be our definition
of sentence, "single-constituent sentences" cannot in general be taken into
consideration because the method of configurations is not applicable to
them. Second, the problem of correlation of "single-constituent sentences"
with a judgment cannot be completely solved. And it is important for us
that the "sentence", determined by particularly formal means, may be placed
in mutually well-defined congruence with the judgment. Thus, a strictly
formal definition of the sentence is important even for logic.

26. TRANSLATION sub specie structuralismi

A. A. Reformatskii (Moscow)

1. Translation results from the variety of languages and the consequent
lack of mutual understanding between their speakers. The purpose of
translation is to supply necessary information (business, scientific,
artistic, etc.) in language comprehensible to a given user of the information.

2. What is the "theory of translation" and can there be a special
science of translation?

Criticism of "literary expansion" (L. N. Sobolev and, in part, I.
Etkind). Where A. V. Fedorov is wrong in including the "theory of
translation" in linguistics. The "theory of translation" not as a science,
but as an object of science, even various sciences. The role of linguistics
in the "theory of translation".

3. Types of translation. Narrowing of "scope of translation" in the
usual view. Where L. N. Sobolev is wrong in considering translation
limited to three types. How "type of translation" is defined. A given text
and the goal of translation. What is the structure of a given text in its
known linguistic features and in its social trends. The linguistic features
of a given text as determining the type of translation. Relevancy and
non-equivalence of translation elements in various types of translation.

4. Translation as information and interpretation. What the "structure
of translation" as a whole consists in. Initial data of translation, act
of translation, results of translation in the structural sense. Where Z.
Klemensevich and I. Etkind are right and I. Kashkin is wrong. Various types
of relations between original and translation.

5. The problem of "translatability" and "non-translatability". What
"lack of mutual understanding" consists in. Why Humboldt is right and Kashkin
is wrong. The unwarranted claims of A. V. Fedorov and others. What is
adequacy of translation in connection with analysis of translation elements
and understanding of translation type.

6. Methods and circumstances of translation dictating various
solutions of the translation problem. Ad hoc translations; translations
that are "the task of a lifetime"; lexicography; informative translations;
technical and scientific translations; artistic translations; translations
of philosophical texts; machine translation. Cooperation of sciences and
talents in diversity of translation activity.

27. A SYSTEM OF RECORDING SPEECH FOR ORAL TRANSLATION 

V. Yu. Rozentsveig (Moscow)

1. Oral translation differs from translation of a written document
in that the words to be translated are perceived by the ear, transformed,
stored in the memory, and later delivered orally. These operations take
place more or less simultaneously (depending on the kind of oral
translation).

2. The limited capacity of the "short" memory of man results in
considerable losses of information when large segments of speech are
translated. Moreover, overloading the memory makes the analysis and synthesis
of a spoken communication difficult. It is necessary to work out a system
of recording speech constructed in such a way that it would interact with
the short human memory, thus ensuring reliable storage of information,
facilitating perception, and recreating the oral message.

3. A phonetic (alphabetic) writing system has not been devised for
the recording of foreign speech. Stenography is unacceptable for oral
translation because it registers the words in toto (including redundant
and unnecessary words) and requires too much time to decipher. The system
of shorthand worked out empirically in the University of Geneva's School
for Translators largely meets the needs involved in recording speech for
oral translation (Rozan's work). However, it does not solve our problem
owing to its unsystematic nature and internal contradictions.

4. The task of developing an efficient system of recording speech for
oral translation amounts to the creation of a unique elementary information
language requiring the solution of several logical, psychological, and
linguistic problems, to wit:

(a) logical analysis of speech, isolation of semantic fulcral
points and systems of connecting them;

(b) identification of the properties and action mechanism of
the "short" human memory;

(c) determination of linguistic redundancies, common (stereotypic)
word combinations and sentences capable of being reduced to
symbols, the most efficient techniques of designating
morphemes and syntactic connections in the system of the
complex whole. In addition, we must keep in mind the necessity
of working out a recording system that will be applicable to
a pair of languages and easily mastered by those studying to
become translators.

28. LANGUAGE TRAINING FOR BLIND DEAF-MUTES


1. The simultaneous lack of visual and aural analysers and thereby
of the speech analyser is an exceedingly unusual condition for a child.
The unusualness consists in the fact that the deaf-dumb-blind child is
completely normal as far as neural and cerebral structure is concerned and
therefore retains potentially the full capacity for intellectual development
like that of any normal child. Nevertheless, using just his own efforts
and without outside help he cannot make initial contact with the external
environment surrounding him.


2. Development of a deaf-dumb-blind child's first contacts with his
environment is an extremely complex problem that can only be solved by
selecting a rigorous system of initial signals. This is achieved by special
teaching and a special grammar. Ordinary general (particularly "school")
grammars as presented in general courses cannot be used.

3. If the system of initial signal contacts is developed in close
conformity with the logic of the external physical environment, formation
of the second signalling system on the basis of the first is not
particularly complex and is chiefly a technical problem. The heart of the
matter lies, therefore, not in the second, but in the first signalling system.

4. The second signalling system (language) in teaching a deaf-dumb-blind
child has various forms: gesticulatory, dactylic, touch (Braille),
written, oral. The second signals must be strictly used in the same order
as listed above.

5. A text is a basic link in the second signalling system, but not
separate words or separate sentences. Hence, the language instruction of
a deaf-dumb-blind child must begin with texts, not separate words or
sentences.

6. The beginning texts are short, consisting altogether of 3-4
so-called "simple" non-extended sentences (two-member). Five or six of these
are enough, after which the child can pass to texts composed of "simple"
extended sentences which, according to the rules, must include objects
(direct or indirect indiscriminately).

The remaining syntactic constructions, even such difficult ones as
complex clauses, are assimilated with the "simple" extended sentences in
the series of texts. What general grammar calls a "simple" sentence is
not at all "simple" as far as the teaching of deaf-dumb-blind children is
concerned.


29. SOME GENERAL PRINCIPLES IN COMPILING GLOSSARIES
FOR MACHINE TRANSLATION

G. M. Strelkowskii (Moscow)

1. The word as a basic unit of language. "Every word (speech)
generalizes" (Lenin, Philosophical Notebooks). Since ideas originate
simultaneously with words and are expressed through words, the very
possibility of logical thought is created solely by language. The unity
of language and thought is organic, i.e. language can neither arise nor
exist without thought, nor thought without language. However, words are
not identical to ideas. Words may have several meanings, i.e. they may
express different ideas and, vice versa, one idea may be expressed by
several words. A word may contain not only the expression of an idea, but
also the relation of the speaker to the object designated by the given
word (KHOLOD /cold/, KHOLODISHCHE /extreme cold/, KHOLODOK /slight cold/,
etc.).

2. In this connection one should mention the impossibility of de- 
scribing language without referring to meaning (the weakness in the theories 
of American structuralists and their followers). The unsoundness of 
theories reducing language to a system of pure relationships (Yel'mslev). 

3. In accordance with the considerations set forth above, algorithms 
for machine translation must be based on a dictionary of meanings. 

4. The principles of word choice for a machine dictionary. 

(a) Significant and auxiliary words. 

(b) Division of significant words into technical terms and words
in common use.

(c) Need to ascertain the minimum of international words required
for comprehension of technical texts.

(d) Choice of subjects (Electronics, particularly the section
dealing with automatic control, since this field is now
included in all branches of industry and science, and is a
basis for machine translation itself. Competence of author).

(e) Problem of compound words and word formation. Regular and
irregular translation of compound words. Glossary of stems
and program therefor or information referring to tables of
suffixes and paradigms of word changes (including stem changes,
e.g. stem forms of strong verbs).

5. Word combinations and phrases. Providing words in the glossary 
with an index indicating possible stock phrases. Translation of lexical 
homonyms by the method of analysing word combinations. 

6. Methods of work in compiling dictionaries. Choice of articles,
reading them, writing out all words, except the commonest helping words
(auxiliary verbs, pronouns, prepositions, etc.), on index cards; alphabetic
arrangement of cards. Numbering sentences in the text and corresponding
index on the cards for ready location of possible occurrences of the word.
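
The card-index procedure in item 6 corresponds closely to building a small
word index keyed by sentence number. The Python sketch below is one possible
rendering under stated assumptions: the stop list is a hypothetical English
stand-in for "the commonest helping words", tokenisation is a plain split on
spaces, and the two sample sentences are invented.

```python
from collections import defaultdict

# Hypothetical stop list standing in for "the commonest helping words".
HELPING_WORDS = {"the", "a", "of", "is", "in", "to", "and", "it"}


def build_card_index(sentences):
    """Map each word to the numbers of the sentences it occurs in, alphabetically."""
    cards = defaultdict(set)
    for number, sentence in enumerate(sentences, start=1):
        for word in sentence.lower().split():
            word = word.strip(".,;:()\"'")
            if word and word not in HELPING_WORDS:
                cards[word].add(number)
    # Alphabetic arrangement of the cards, with sentence numbers in order.
    return {word: sorted(numbers) for word, numbers in sorted(cards.items())}


text = ["The relay closes the circuit.", "A closed circuit carries current."]
index = build_card_index(text)
for word, numbers in index.items():
    print(word, numbers)
```

Each entry plays the part of one index card: the word itself plus the
numbers of the sentences in which it can be located.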

7. Statistical conclusions. Alphabetic arrangement of words. Per- 
centage of technical terms. Repetitiousness of non-technical terms. 

8. Methods of expanding the glossary with and without the machine. 


Reading of other materials on the given subject and enrichment of
glossary with common words.

Inclusion within the glossary of all technical terms already selected 
in the special glossaries of technical terms on the given subject. Treat- 
ment of new texts by the machine with separation of words not known to it 
and presenting them untranslated, or simply a selection of new words. 

9. A selected glossary as a foundation for constructing a translation
algorithm without the creation of some metalanguage.


30. SOME ANALOGIES TO THE PROBLEMS AND METHODS OF
CONTEMPORARY LINGUISTICS IN
INDIAN GRAMMATICAL WORKS


V. N. Toporov (Moscow)


1. Linguistics has perhaps never been so independent and complacent
as it is today. This is undoubtedly due to the fact that the real object
of the science has been found. On the other hand, the connection between
linguistics and other sciences has never before been so strongly felt.
But this connection is effected not on the earlier basis, when attempts
were made to apply the methods of one science to another, but on a new
one. It is characterized by some ideas common to a number of sciences.

These ideas developed (often independently) on the soil of the various
sciences. The isomorphism of certain fundamental concepts (of "structure",
"invariant", etc.), the similarity of individual problems and
methods of solution. It is becoming increasingly evident that certain
common ideas and methods are being superimposed, as it were, on the material
of the particular sciences and transformed in accordance with the nature
of the material, the possibility of giving it a strictly formal
interpretation, the scientific traditions in the given field, etc. For this
reason the prospects for a new synthesis of various sciences on a new basis
are now being carefully assessed (cf. International Encyclopedia of
Unified Science, vol. I, 1938-1939; B. Hanssen. The concept of field as
a synthesis of natural science and humanities traditions in sociology,
Vestnik istorii mirovoi kul'tury /Herald of the History of World Culture/,
1957, no. 4, etc.).

At this time when linguistics is very clearly aware of its place
among the other sciences and the new direction in linguistics is
interpreted as being something broader than simple opposition to old ideas,
it is natural that there should be growing interest in the outlook for the
development of linguistics, the nature of its connections with other
sciences, and the ultimate fate of these connections.

When one examines these problems, it is difficult to avoid thinking
about certain striking analogies to modern linguistic problems that may be
found in the history of ancient Indian science, particularly linguistics,
and which are attracting the attention of modern scholars with increasing
frequency (L. Bloomfield, Emeneau, Bro, Allen, Renov, and others).

3. Let us list the most important analogies in the light of
contemporary problems.

(a) Formal principle of language description ("descriptivism"),
exclusion of meaning in analysis (if we disregard the very small number
of Sutra-interpretations that sometimes deal with the determination of
connections of semantic (according to Morris) order); fullness of
description, including differentiation between the obvious and the
non-obvious.

(b) Elements of a systematic approach to language: clear
distinction between class and member of class with fixed place; hence,
on the one hand, the concept of zero, on the other, potential forms,
hypergrammaticisms, false variation (often supported by the striving for
conciseness in exposition); contrast of sphota-sabda; negative
characteristics of members in relationship; Prabhakara's teaching on
semantics; schools on relation of word and sentence and the dependence of
the former on the latter; distinction between signum-designatum-denotatum.

(c) The metalanguage of Indian grammatical treatises; symbols
(sign-index, sign-symbol types); metalanguage grammar in cognition.

(d) In connection with these features of ancient Indian linguistics,
mention must also be made of similar phenomena in other fields: the esthetic
code in ancient Indian art, particularly in the drama; the concept of dhvani
(an analogy to sphota); some analogies in the works of ancient Indian
logicians and philosophers (categories of relation, time; "nominalism");
characteristics of Indian historiography; the concept of zero among the
mathematicians of ancient India, etc. A comparison with ancient Greek science
enhances the significance of the specific features of ancient Indian
grammatical literature, which in many respects resembles modern linguistics.

31. THE FREQUENCY OF LEXICAL UNITS IN ENGLISH
GEOLOGICAL LITERATURE


M. G. Udartseva (Petrozavodsk)

1. We undertook a study of frequency of lexical units in English
geological literature in connection with the compilation of a minimal glossary 
for students in geological institutions. As material we selected articles 
on the various branches of geology as well as on the allied sciences. In 
addition, for the sake of objectivity in the tally, we included a considerable 
number of authors from several English-speaking countries. The final listing 
of sources comprised 33 works containing a total of 250,000 words, of which
28 were articles from 14 periodicals published in the United States, Great
Britain, Canada, India, and Australia, while 5 were excerpts from monographs.

2. The literature dealt with problems related to the following branches
of geology: mineralogy, crystallography, petrography, petrography of
sedimentary, igneous, and metamorphic rocks, petrology, stratigraphy,
paleontology, lithology, tectonics and structural geology, origin, distribution,
and exploitation of mineral resources, geology of oil and coal deposits,
geophysical methods, prospecting for mineral deposits, radioactive methods
of determining the age of rocks, quaternary geology, geomorphology and
glaciology, dynamic geology, geology of the ocean bottom, and regional
geology.


3. Individual words, phrases, and verbs plus postpositions were used
in the count. Each additional meaning of a word was handled as a separate
item. For example, the word "face" was regarded as four separate words
corresponding to the meanings of "side", "face" (of crystal), "surface", and
"to put something in front of a person".

4. Each newly encountered lexical item was entered on a separate
index card, where all subsequent usages, with indication of author, were noted.
If the word occurred more than 100 times in different authors, no further
entries were made. Such words as "that", "which", "it", etc. were handled
similarly.
5. The count resulted in a determination of the frequency for 7555
words. We entered into the minimal glossary 2375 lexical items consisting
of 546 verbs, 954 nouns, 327 adjectives, 236 adverbs, and 310 other kinds
of words. Of this number 176 words are specialized terms; more than 200
words have another meaning in geological literature, while the remainder
are ordinary words. About 4000 of the 7535 words are technical terms.

6. The minimal glossary was tested by taking several random pages
of diverse literary, general political, and geology material and
calculating the percentage of words from each text that were lacking in the
minimal glossary. It turned out that a page of geological text contained
1-1.5% "unfamiliar" words, general political text 8-10%, and literature
(Dickens) 16-18%.

7. The minimal glossary was also collated with the Thorndike
dictionary. Significant discrepancies were noted even in determining the 
first 500 words. 
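
The tally in paragraphs 3-5 and the coverage test in paragraph 6 can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the function names and the miniature corpus are hypothetical, and word forms are counted as such, whereas the study treated each meaning of a word as a separate item.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def frequency_list(texts):
    """Tally how often each word form occurs across all source texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def unfamiliar_percentage(page, glossary):
    """Percentage of running words on a page that are absent from the glossary."""
    words = tokenize(page)
    if not words:
        return 0.0
    missing = sum(1 for word in words if word not in glossary)
    return 100.0 * missing / len(words)


# Hypothetical miniature corpus and glossary, for illustration only.
corpus = ["The rock face is steep", "the face of the crystal"]
counts = frequency_list(corpus)
glossary = {word for word, n in counts.items() if n >= 2}
```

Running `unfamiliar_percentage` on pages of different genres against the same glossary gives figures of the kind compared in paragraph 6.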

32. ONE APPROACH TO LOGICAL SEMANTICS 

V. K. Finn and D. Eh. Lakhuti (Moscow)

1. Our approach to logical semantics can be summed up as follows:

(a) some language of science with minimal pragmatics is selected 
as the investigated language (e.g., the language of synthetic 
organic chemistry, formal genetics, classical mechanics, etc.) 

(b) an artificial language, language I, is constructed for the
investigated language; it consists of a glossary (class of basic
technical terms and syntactic functors) and a class of indexes
for the glossary, as well as a formal syntax in which are
formulated the rules for building sentences consisting of the
indexes. A correctly formed sentence in language I is
determined with the help of an algorithm constructed in the
formal syntax.

(c) Language I is expanded into language II, consisting of language
I, a list of descriptions of types of sentences in language I
(examples of such types of sentences for the language of
synthetic organic chemistry will be: sentences conveying
information about compounds; sentences conveying information
about reactions; sentences conveying information about
reaction conditions), and a list of combinations of indexes
corresponding to the types of sentences formed.
(d) In accordance with the types of sentences, algorithms are
constructed in language II that discern the meaning of these
sentences.

If sentence F is correctly constructed and all the
indexes are replaced by dictionary signs, and if the combination
of indexes corresponding to F coincides with the combination
for one of the sentence types in language I, the
algorithm will convert F into the sign "S" if all the predicates
of the corresponding description are satisfied for F; if even
one predicate of the description is not satisfied for F, the
algorithm will convert F into an empty word.

In the first case we will say that "F has meaning in
language I", in the second "F does not have meaning in language
I". If, however, the algorithm is not applied to F, we will
say that the meaning of F is not determined in language I.

A descriptive syntax is formulated in language II. It
consists of suitable algorithms to discern the meanings of
sentences and a list of rules according to which meaningful
sentences are derived from meaningful sentences.

(e) Language II is subsequently expanded into language III, in which
definitions with reference to the properties of language I and
its relations to the investigated language are formulated.
Language III consists of language II and a list of definitions.

Language III contains definitions of the concepts of the
semantic completeness of language I, translatability (full or
partial) of the investigated language into language I,
interpretation of language I within the amalgamation of language
I with the investigated language, explicitness of language I,
and other semantic concepts.

If it is possible to construct a series of languages I,
II, III for the investigated language, we will say that the
"semantic analysis of the investigated language" has been
realized. If the investigated language is at least partially
translatable into language I, it is suggested that "semantic
analysis of the investigated language" can be effected by an
automatic machine.

(f) "Semantic analysis" is in the experimental stage, and that
is why we speak about an "approach" to semantics, and not the
construction of a deductive system of semantics.

However, the deductive construction of a system of
semantics is possible on the basis of experimental investigations
of the "languages of science" (with minimal pragmatics).
In the preparation of this paper we have used the ideas and results
of research in semantics by A. Tarski, K. Ajdukiewicz, L. Chwistek, Y.
Bar-Hillel, H. Curry, W. Quine, N. Goodman, and R. Carnap.
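
The meaning-discerning algorithm of item (d) has three outcomes: the sign "S", the empty word, or no determination at all. A minimal sketch of that decision, assuming a hypothetical glossary, index set, and single predicate (none of which come from the paper), might look like this:

```python
def discern_meaning(sentence, index_of, sentence_types):
    """Return "S", the empty word "", or None when meaning is not determined.

    sentence       -- list of dictionary signs
    index_of       -- maps each dictionary sign to its index
    sentence_types -- maps a combination of indexes to a list of predicates
    """
    if any(sign not in index_of for sign in sentence):
        return None  # a sign is missing from the glossary: algorithm not applied
    combination = tuple(index_of[sign] for sign in sentence)
    predicates = sentence_types.get(combination)
    if predicates is None:
        return None  # combination matches no sentence type: not applied
    if all(predicate(sentence) for predicate in predicates):
        return "S"   # F has meaning in language I
    return ""        # F does not have meaning in language I


# Hypothetical fragment of a language for organic chemistry.
index_of = {"ethanol": "compound", "acetaldehyde": "compound",
            "oxidizes_to": "reaction"}
sentence_types = {
    # A reaction sentence must relate two different compounds (assumed predicate).
    ("compound", "reaction", "compound"): [lambda s: s[0] != s[2]],
}
```

Keeping `None` separate from the empty word mirrors the paper's distinction between "does not have meaning" and "meaning is not determined".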

33. SOME PROBLEMS CONNECTED WITH THE HANDLING OF
WORDS WITH ALTERNATIONS IN CONSTRUCTING
[remainder of title illegible]

(A Statistical Inquiry) 


R. M. Frumkina (Moscow) 


The compilation of a dictionary of stems is a necessary stage in the
task of constructing an algorithm of machine translation. By stem we
understand the graphically invariant part of a word. However, there are a
number of languages in which the graphically invariant part of certain
words, principally verbs with alternating forms, consists of one or two
letters, an inconvenience resulting in homonymy of stems. It is therefore
necessary to separate only the purely standard endings (person, number, etc.)
and assume that a given word has several stems.

There are two possible ways of solving the problem:


(1) Enter into the dictionary all the stem variants of each word
with plural stems, e.g. perfective and imperfective aspect, present and
past tense stems, etc. We thereby increase (and sometimes considerably)
the size of the dictionary.


(2) Select the most frequently occurring variants and enter
them into the dictionary; for the other stems, furnish the rules by which
they are in some manner to be identified or formed according to the stems
listed in the dictionary. This would enable us markedly to reduce the
size of the dictionary, but at the price of complicating the program.

In order to determine the more efficient method, it will be necessary
above all to carry out a statistical inquiry concerning words with plural
stem variants and their frequency. We are now analyzing the frequency of
verbs with alternating forms in a Spanish scientific (mathematical) text.
On the basis of data in the frequency dictionary of V. Garcia Hoz, all
Spanish words with a frequency of more than 40 were first divided into
classes depending on the types of alternation. Then the frequency both of
classes and of individual morphological forms was determined from
consecutive material in mathematical texts.

The data thus obtained clarify the principles governing the distri- 
bution of classes and alternating forms and enable us to make certain 
recommendations in compiling a dictionary and rules for handling stems.
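
The trade-off between the two methods can be made concrete with a small sketch. The Spanish stems and the alternation rules below are illustrative assumptions chosen for the example (the abstract gives none), and the function names are hypothetical.

```python
# Method (1): every stem variant is its own dictionary entry.
FULL_DICTIONARY = {
    "pod": "poder", "pued": "poder",
    "cont": "contar", "cuent": "contar",
}

# Method (2): only one variant per word is listed; rules recover the others.
SMALL_DICTIONARY = {"pod": "poder", "cont": "contar"}
# Each rule maps a diphthong found in the text to the vowel of the listed stem.
ALTERNATIONS = [("ue", "o"), ("ie", "e")]


def lookup_full(stem):
    """Direct lookup: larger dictionary, trivial program."""
    return FULL_DICTIONARY.get(stem)


def lookup_with_rules(stem):
    """Smaller dictionary, but the program must undo the alternation."""
    if stem in SMALL_DICTIONARY:
        return SMALL_DICTIONARY[stem]
    for variant, base in ALTERNATIONS:
        if variant in stem:
            candidate = stem.replace(variant, base, 1)
            if candidate in SMALL_DICTIONARY:
                return SMALL_DICTIONARY[candidate]
    return None
```

Which method is cheaper depends on how often the unlisted variants occur in running text, which is what the statistical inquiry is meant to establish.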
34. A LOGICAL ANALYSIS OF THE CONCEPT OF
LANGUAGE STRUCTURE

S. K. Shaumyan (Moscow)

1. Modern structural linguistics interprets language structure on
the Gestalt plane, i.e., as a whole, the elements of which are connected by
definite relations.

2. If we consider that language elements interact on two axes,
syntagmatic and paradigmatic, an interpretation of language structure on
the Gestalt plane must be regarded as one-sided: we encounter wholes,
the elements of which are connected by definite relations, only on the
syntagmatic axis (such wholes, for example, are syllables in phonology or
sentences in grammar). However, on the paradigmatic plane we deal not
with wholes, but with classes of ordered elements; the elements of these
classes are interlinked by definite relations, but the classes cannot be
identified in any way with the wholes.

3. There arises the need of defining language structure in such a
way that the definitions may be applied to the interaction of language
elements not only on the syntagmatic, but also on the paradigmatic axis.

4. The new definition of the concept of language structure is based
on the general concept of structure in modern symbolic logic, where it is
defined thus: the structure of a given relation is the property of being
isomorphic with the given relation.

5. Modern structural linguistics, as we know, distinguishes two planes
in language: the plane of expression and the plane of content (phonology
is included in the former, grammar and lexicology in the latter). Since
isomorphism exists between both planes, we may rely on the definition of
the general concept of structure in symbolic logic and define language
structure thus: language structure is the property of the relations of
elements on the plane of expression and of the relations of elements on the
plane of content to be isomorphic with one another. This definition of
language structure is in complete accord with the research techniques of
structural linguistics at its present stage of development.

6. A logical analysis of the concept of language structure requires 
an operational approach to this concept. Accordingly, the report states 
how we should set up empiric operations by means of which language structure,
as an abstraction, can be linked to genuine linguistic activity. 
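
The logical notion used in paragraph 4, two relations being isomorphic, can be checked mechanically once a one-to-one correspondence between the elements is proposed. The sketch below is a generic illustration; the toy relations on the two planes are hypothetical and are not taken from the report.

```python
def is_isomorphism(mapping, relation_a, relation_b):
    """Check that `mapping` carries relation_a exactly onto relation_b.

    mapping    -- dict from elements of the first field to the second
    relation_a -- set of ordered pairs over the first field
    relation_b -- set of ordered pairs over the second field
    """
    if len(set(mapping.values())) != len(mapping):
        return False  # the correspondence is not one-to-one
    if any(x not in mapping or y not in mapping for x, y in relation_a):
        return False  # some element has no counterpart
    image = {(mapping[x], mapping[y]) for x, y in relation_a}
    return image == set(relation_b)


# Hypothetical toy relations on the plane of expression and the plane of content.
expression = {("p", "b"), ("t", "d")}
content = {("singular", "plural"), ("present", "past")}
correspondence = {"p": "singular", "b": "plural", "t": "present", "d": "past"}
```

Under this reading, "having the same structure" means that some such correspondence exists; the function only verifies a given candidate and does not search for one.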
35. ANCIENT TEXTS AND MACHINE TRANSLATION
(A formulation of the problem) 

V. Shevoroshkin (Moscow) 

1. There is no doubt that a great many philosophers, historians,
ethnographers, and even specialists in literature have an acute need of
Russian translations of a large number of ancient texts.

2. The available translations are a drop in the ocean compared with 
the mass of ancient literary monuments.

3. Texts in dead languages have one feature that distinguishes them
from texts in modern languages, namely, the frequent impossibility of
proving that the original author had in mind precisely what we "read into"
the text.

4. The feature of ancient texts noted above has produced and is con- 
tinuing to produce numerous commentaries on these texts. 

5. The translator of ancient texts is in essence a commentator. Even
the translator who strives for maximum objectivity inevitably introduces 
into his work many subjective elements, which vary in degree with the depth 
of his erudition. 

6. An investigator who requires the translation of an ancient text
may also need a commentary, but his primary need is for a maximally
objective translation. When reading such a translation, he should confront
the same difficulties that are mastered by a person who reads the text
in the original. However, a translation done by a human being does not
meet these needs for the reasons mentioned in (5) above.

7. Machine translation of ancient texts will enable a student to
obtain exactly what he needs. "Interpretation" of a text by a machine is
excluded. The more "elementary", the better.

8. Thus, machine translation of ancient texts is particularly im- 
portant, for the machine is not merely a substitute for a live translator, 
but - in this respect alone - it does what a person can't do. 

9. Certain characteristics of the ancient Indo-European languages
enable us to assert that these languages are more accessible to machine
translation than are the living languages. These characteristics include
comparatively greater transparency of morphology and simplicity of syntax,
numerous trite phrases, etc. This problem will be considered in detail
on Sanskrit material.

10. For the reasons set forth above machine translation of ancient 
texts into Russian is a problem that deserves detailed elaboration.
SECTION ON ALGORITHMS OF MACHINE TRANSLATION
36. AN ALGORITHM FOR TRANSLATING FRENCH INTO RUSSIAN

TOMigm? 

V. A. Agrayev 
(Gorki) 

The algorithm was designed for use in connection with an electronic
computer of the GIFTI /Gor'kovskii issledovatel'skii fiziko-tekhnicheskii
institut/Gorky Research Institute of Physics and Technology/, possessing a
limited memory capacity. The aim was to determine the translation
capabilities of the machine as well as to check the operation of the algorithm
with limited glossary and rules.

The algorithm includes lexical routines, a glossary of stems, a glossary
of phrases, and charts for translating polysemants. The stem glossary
contains about 500 words. In addition, we prepared a large glossary (about
1200 words) containing the full, original forms. The amount of grammatical
information included with the words varies in the two glossaries: less is
given in the stem glossary. The phrase search is based on the semantically
pivotal word. The translation routines of polysemants contain tests for
contextual environment, and the required meaning is selected accordingly.

Analyzing rules determine the meaning of French inflections and,
depending on the governing words, establish the necessary grammatical forms
of the other words.

In the synthesis routines Russian word forms are constructed on the 
basis of grammatical information derived from the glossary and developed 
during the process of analysis. Synthesis is effected with regard for its
applicability also to translating English radio engineering texts. 

Statistically chosen data were used in constructing the algorithm. 

37. PRINCIPLES IN THE CONSTRUCTION OF ELECTRIC
READING DEVICES


N. D. Andreyev 
(Leningrad) 

1. The problem of electric reading devices (EChU) /elektrochitayushchiye
ustroistva/ arises because of the slowness in preparing a text for machine
translation, which is inevitable when a human being does this work
(particularly in oriental language texts).

2. An electric reading device must be adapted for machine sensing of 
scripts of varying size, slant, proportion, and graphic shape. 

3. The different sizes, slants, and proportions of scripts may be
reduced to a single standard by using the three-set system of mirrors of
variable curvature /TREKHKOMPLEKTNOI SISTEMY ZERKAL PEREMENNOI KRIVIZNY/.

4. Scripts of different shapes may be adapted for machine sensing
by using the principle of key identification points /KLYUCHEVYKH
OPOZNAVATEL'NYKH TOCHEK/, the number of which cannot exceed 50 for Cyrillic
and Latin; it may reach 100 for Arabic, Devanagari and their derivatives,
and about 300 for Chinese and Japanese.

5. The set of key points is individualized for each of the graphic 
signs and is interpreted for each language in accordance with a special 
program that constitutes the introductory part of the analysis in the 
appropriate algorithm. 
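
One way to picture the principle of key identification points is the sketch below: each sign is described by a few positions at which the normalized image must (or must not) be inked, and a scanned sign is identified by testing only those positions. The 3-by-3 grid, the three signs, and the chosen points are assumptions made for illustration; the abstract does not describe the actual device at this level.

```python
def recognize(dark_cells, key_points):
    """Return the signs whose key-point pattern agrees with the scanned image.

    dark_cells -- set of (row, column) cells that are inked
    key_points -- maps each sign to {(row, column): expected_inked}
    """
    return [
        sign
        for sign, pattern in key_points.items()
        if all((cell in dark_cells) == inked for cell, inked in pattern.items())
    ]


# Hypothetical key points on a 3x3 grid: two points per sign suffice here.
KEY_POINTS = {
    "I": {(0, 0): False, (1, 1): True},
    "L": {(1, 0): True, (1, 1): False},
    "T": {(0, 0): True, (1, 1): True},
}

# Inked cells of three normalized sample images.
IMAGE_I = {(0, 1), (1, 1), (2, 1)}
IMAGE_L = {(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)}
IMAGE_T = {(0, 0), (0, 1), (0, 2), (1, 1), (2, 1)}
```

The point of the design is that only the key points are inspected, not the whole image, so the number of points needed grows with the size of the script's inventory, as paragraph 4 indicates.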

38. WORK ON AN INDONESIAN-RUSSIAN ALGORITHM
OF MACHINE TRANSLATION


N. D. Andreyev 
(Leningrad) 

1. The Indonesian language requires preliminary treatment of the 
words in order to strip their roots. Stripping of the root by direct 
resort to a dictionary appears to be impossible. 

2. Three factors make it difficult to strip the root: (1) the
presence of initial and secondary prefixes and suffixes; (2) internal sandhi,
i.e. the phonetic interaction of morphological elements; (3) the presence
of root reduplicators and polyreduplicators, which occur in two graphic
variants.

3. Much preliminary work was required for the statistical and 
structural investigation of Indonesian words. Different versions of the 
root-stripping program were based on this work. 

4. Processing the words in the root-stripping program makes it 
possible to proceed to morphological analysis, which is effected by 
a special morphological program that is often realized in a purely 
analytic way, i.e., without resorting to the output language, but by 
substituting words in their code hieroglyphic. 

5. Based on a certain working hypothesis concerning the structure of
the Indonesian sentence, it seems possible to construct a standard analysis
constituting the principal part of the syntactic program; it is only for a
minor portion of the sentences that we need a non-standard analysis forming
a more complicated but much less frequently used part of this program.

6. The homonym and phraseology programs are operated after the first 
three programs are completed, relying on the hieroglyphic analysis effected 
therein. 




7. The propositional and glossary program works chiefly by
conversion, i.e., according to the output language.

8. Tables of pseudoroots and typical sets of morphological
information are being developed as necessary supplements to the main
glossary.
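
The three difficulties listed in paragraph 2 (affixes, internal sandhi, reduplication) can be illustrated with a much simplified root-stripping sketch. Everything in it is an assumption made for the example: the tiny root list, the affix inventories, the sandhi table, and the two spellings of reduplication ("buku-buku" and "buku2"). It is not the program described in the abstract.

```python
ROOTS = {"tulis", "buku", "baca", "ajar"}            # hypothetical root list
PREFIXES = ["meng", "mem", "men", "me", "ber", "di"]
SUFFIXES = ["kan", "an", "i"]
# Internal sandhi: after a nasal prefix the root's first consonant may be lost,
# so each prefix lists the consonants that might have to be restored.
SANDHI_RESTORE = {"men": ["", "t"], "mem": ["", "p"], "meng": ["", "k"]}


def strip_root(word):
    """Return the root of `word` if one can be found, otherwise None."""
    # Reduplication in either assumed graphic variant.
    if word.endswith("2"):
        word = word[:-1]
    elif "-" in word:
        left, right = word.split("-", 1)
        if left == right:
            word = left
    # Remove a prefix, restoring a consonant lost through sandhi.
    candidates = [word]
    for prefix in PREFIXES:
        if word.startswith(prefix):
            rest = word[len(prefix):]
            for restored in SANDHI_RESTORE.get(prefix, [""]):
                candidates.append(restored + rest)
    # Remove a suffix from every candidate obtained so far.
    without_suffix = [
        c[:-len(suffix)] for c in candidates for suffix in SUFFIXES
        if c.endswith(suffix)
    ]
    for candidate in candidates + without_suffix:
        if candidate in ROOTS:
            return candidate
    return None
```

Even this toy version shows why a direct dictionary lookup on the text form is not enough: "menulis" contains neither "tulis" nor any other listed root as a plain substring match of the right kind until the sandhi is undone.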

39. WORK ON A VIETNAMESE-RUSSIAN ALGORITHM
OF MACHINE TRANSLATION


N. D. Andreyev, D. A. Batova, and
V. S. Panfilov (Leningrad)

1. The Vietnamese-Russian algorithm of machine translation includes
the following programs:

(a) Glossary of binomials.

(b) Glossary of roots.

(c) Glossary of idioms.

(d) Supporting program /OPORNAYA PROGRAMMA/.

(e) Syntactic program.

(f) Homonymic program.

2. The glossary of binomials assumes the stripping of two-syllable
Vietnamese words with their grammatical information.

The glossary of roots includes monosyllabic words and their
grammatical information. The existence of two glossaries is due to the
problem of word boundary in isolating languages.

The glossary of idioms contains idioms, phrase combinations, and
hard-to-translate expressions.

The supporting program serves to differentiate between parts of
speech in those cases where the appropriate grammatical information cannot
be precisely indicated either in the glossary of roots or in the glossary
of binomials.

The syntactic program provides for an analysis of Vietnamese syntactic
constructions.

The homonymic program is designed to solve the problem of lexical
homonymy within any single part of speech. The program deals principally with
monosyllabic words, since homonymy is not characteristic of binomials.
3. In connection with the adoption of a syntactic standard, which
consists in utilizing syntactic analysis to determine the parts of speech,
the range of application of the supporting program is narrowed to
exceptions to standard cases.

4. Besides utilization of the supporting program, exceptions to 
standard cases may be solved by inserting appropriate corrections into the 
syntactic program.

5. The supporting program is characterized by:

(a) The ability of individual words to occur in a sentence as a
substantive and a verb.

(b) The fact that such words stand closer to the verb than to the
substantive. Therefore, when used as substantives, they often receive various
grammatical indicators that are peculiar to substantives.

(c) A number of verbs may be brought into the category of
substantives by means of appropriate auxiliary elements.

(d) What has been set forth above explains the impossibility of
accurately indicating in the glossary the part of speech of the words in
question. The part of speech may be indicated only disjunctively.

(e) Determination of the part of speech to which words of the
type in question belong may be made in each specific case with the help of
carriers of grammatical data located in the supporting program.
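
The interplay of the two glossaries and the disjunctive part-of-speech marking can be sketched as follows. The entries are hypothetical and written without diacritics; the greedy "binomial first" lookup is an assumption about how the word-boundary problem might be handled, not a description of the authors' program.

```python
# Hypothetical glossary entries (diacritics omitted). A set with more than one
# member is a disjunctive part-of-speech indication.
BINOMIALS = {("sinh", "vien"): {"substantive"}}
ROOTS = {
    "toi": {"pronoun"},
    "la": {"verb"},
    "hoc": {"verb", "substantive"},
}


def segment(syllables):
    """Split a syllable sequence into glossary units, trying binomials first."""
    units = []
    i = 0
    while i < len(syllables):
        pair = tuple(syllables[i:i + 2])
        if len(pair) == 2 and pair in BINOMIALS:
            units.append((" ".join(pair), BINOMIALS[pair]))
            i += 2
        else:
            syllable = syllables[i]
            units.append((syllable, ROOTS.get(syllable, set())))
            i += 1
    return units


def needs_supporting_program(unit):
    """A unit with a disjunctive part of speech must be resolved from context."""
    _, parts_of_speech = unit
    return len(parts_of_speech) > 1
```

Units flagged by `needs_supporting_program` correspond to the words of item 5(d), whose part of speech the glossary can indicate only disjunctively.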

40. WORK ON A JAPANESE-RUSSIAN ALGORITHM
OF MACHINE TRANSLATION


A. A. Babintsev
(Leningrad)

1. Work on a Japanese-Russian algorithm was begun at the end of
December 1957, using atomic energy texts. At this stage analysis of
material is limited to the simple sentence.

2. Due to the fact that no reading devices are available for
ideographic text, the Japanese must be transcribed into Russian before it
is put into the machine.

3. The structure of Japanese, agglutination (substantive and verb
in part) and inflection (verb in part and adjective) with the stress on
agglutination, is responsible for the effectiveness and adequacy of the
standard morphological analysis and determines the primacy of the program
of standard morphological analysis in the set of programs.
4. The set of programs for the Japanese-Russian algorithm at the
present time is as follows:

(1) A program of standard morphological analysis (with referral
to the glossary, "address", and withdrawal therefrom of certain grammatical
information).

(2) A program of standard syntactic analysis (based on a "working
hypothesis").

(3) A program of non-standard syntactic analysis (cases that do
not fit the "working hypothesis").

(4) A homonymic program.

(5) A glossary of idioms.

(6) A synthesizing program.

5. The minimum of information to be derived from text analysis is:
for a substantive, case and, in certain instances, number; for a verb, tense,
voice, mood, finiteness; for an adjective, tense.

6. The "working hypothesis", which is based on the laws of Japanese
sentence structure, in broad outline consists of the following:

(1) The first substantive in the nominative or principal case is
the subject.

(2) The last word before a stop sign is the final predicate; a
verb in non-finite form is the middle predicate.

(3) The direct object immediately precedes the predicate; the
indirect object is found at some distance from the predicate.

(4) A substantive in the genitive case, adjective and verb in
the finite form preceding the substantive are attributes.

7. We should like to direct attention to one of the numerous problems
that have arisen in connection with our work on the algorithm. After
analyzing a Japanese text, from which information on number can be obtained
only sporadically, it turns out that difficulties due to the inadequacy of
information on grammatical number appear in the synthesizing program during
formation of the output text. A solution to the problem of number in the
synthesizing program is exceptionally important for a number of
Oriental-Russian algorithms.
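
The four rules of the "working hypothesis" in paragraph 6 lend themselves to a direct sketch. The token format (part of speech, case, finiteness) is an assumption about what the morphological analysis delivers, the case labels for objects are hypothetical, and the romanized example sentence is supplied only for illustration.

```python
def assign_roles(tokens):
    """Label each token of a simple sentence according to the four rules.

    Each token is a dict with 'pos' and optionally 'case' and 'finite'.
    """
    if not tokens:
        return []
    last = len(tokens) - 1
    roles = [None] * len(tokens)
    roles[last] = "final predicate"                        # rule (2)
    subject_found = False
    for i in range(last):
        token, following = tokens[i], tokens[i + 1]
        pos, case = token["pos"], token.get("case")
        finite = token.get("finite", True)
        if pos == "verb" and not finite:
            roles[i] = "middle predicate"                  # rule (2)
        elif following["pos"] == "substantive" and (
            case == "genitive" or pos == "adjective" or pos == "verb"
        ):
            roles[i] = "attribute"                         # rule (4)
        elif pos == "substantive" and case == "nominative" and not subject_found:
            roles[i] = "subject"                           # rule (1)
            subject_found = True
        elif pos == "substantive" and case not in (None, "nominative", "genitive"):
            before_predicate = i + 1 == last or (
                following["pos"] == "verb" and not following.get("finite", True)
            )
            # rule (3): adjacency to the predicate separates the two objects
            roles[i] = "direct object" if before_predicate else "indirect object"
    return roles


# "gakusei ga sensei ni hon o watasu" after hypothetical morphological analysis.
sentence = [
    {"word": "gakusei", "pos": "substantive", "case": "nominative"},
    {"word": "sensei", "pos": "substantive", "case": "dative"},
    {"word": "hon", "pos": "substantive", "case": "accusative"},
    {"word": "watasu", "pos": "verb", "finite": True},
]
```

Sentences that these rules label wrongly are the ones the abstract assigns to the non-standard syntactic analysis.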
41. THE PROGRAMMING OF TRANSLATION FROM
ENGLISH INTO RUSSIAN

G. P. Bagrinovskaya and 
G. L. Gavvixova (Moscow) 

Program of translation, constituent parts, order of operation. 

Arrangement of glossary, difference in coding used in English section 
of glossary from coding in French section of glossary. Size of glossary. 
Glossary of phrases. 

Choice of homonyms, construction of complex index scales and omitted 
index scales. 

Operation of analysis program ("rolling up" formulas /FORMULY SVERTKI/).
Program of synthesis of structures on the basis of formulas of synthesis. 
Morphological treatment of results of synthesis. 

Russian part of program of translation from English into Russian 
(utilization of programs prepared for Russian part of French-Russian trans- 
lation). Agreement in codings. 

42. PRINCIPLES IN COMPILING A GERMAN-RUSSIAN
GLOSSARY OF POLYSEMANTS FOR MACHINE
TRANSLATION

S. S. Belokrinitskaya (Moscow) 

Determination of the meaning of a polysemantic word that is appropriate 
in a given context constitutes one of the basic problems in machine trans- 
lation. 

This problem is being solved by compiling a glossary of polysemants 
which will make it possible to obtain the relative meaning of a word by 
an analysis of the surrounding context. In most cases it is sufficient 
to examine context within the boundaries of a sentence. 

A considerable number of words that have multiple meanings in 
the usual literary language have but a single meaning in mathematical 
texts, and the system of meanings for a number of polysemants is simplified. 
However, many German words, even in a mathematics text, have a large number 
of relative meanings, the determination of which requires a rather
complicated system of tests.

The most numerous are prepositions and a group of verbs which are 
used with separable prefixes and which also form a large number of phrases. 
The principal method of determining the relative meaning of a
polysemant is structural-semantic analysis of the surrounding context.
In some cases grammatical forms of the given word or its environment are
also analyzed.

It is possible to isolate certain groups with a monotypic system of
meanings, thereby simplifying the glossary and replacing in some cases the
system of tests (or part of the system) by reference to the appropriate
general rule.

We have also isolated a group of words united according to the principle
of identical effect on the translation of prepositions and some verbs with
extremely many meanings, which likewise permits of simplification of the
routine.


Methods of glossary treatment of different types of idioms and phrases 
have been worked out. 


The routines of polysemants also contain cases of lexical homonymy that
are not excluded from the system that differentiates between the meanings
of polysemants.


The determination of relative meanings of polysemants by means of the
glossary just described is not free from difficulties (in some cases a
single sentence does not provide sufficient context, the translation of
complex words, etc.). However, these difficulties can, as a rule, be
overcome.


A check of the text shows that a completely satisfactory translation of
the mathematical corpus can be achieved with the help of the above-described
glossary of polysemants.
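
A glossary of polysemants of the kind described, in which each entry carries an ordered system of context tests confined to one sentence, might be sketched as below. The German entry, its tests, and the transliterated Russian equivalents are hypothetical examples, not items from the author's glossary.

```python
# Each entry is an ordered list of (test, meaning); the first test that the
# sentence context satisfies selects the meaning. The last test is a default.
POLYSEMANTS = {
    "Satz": [
        (lambda context: bool({"Beweis", "beweisen"} & context), "teorema"),
        (lambda context: True, "predlozhenie"),
    ],
}


def translate_polysemant(word, sentence):
    """Choose the meaning of `word` from the other words of the same sentence."""
    context = set(sentence) - {word}
    for test, meaning in POLYSEMANTS.get(word, []):
        if test(context):
            return meaning
    return None  # the word is not in the glossary of polysemants
```

For example, `translate_polysemant("Satz", ["Beweis", "von", "Satz"])` selects the first meaning because the context test finds "Beweis" in the same sentence. Groups of words with a monotypic system of meanings could share one test list instead of repeating it per entry.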

43. MAIN FEATURES OF THE GLOSSARY AND
GRAMMAR FOR
MACHINE TRANSLATION

I. K. Bel'skaya (Moscow)


1. The basic components of a system of machine translation from
English to Russian as worked out in the ITM and VT /see No. 2 for
expansion and meaning of abbreviations/ of the Academy of Sciences, USSR, are a
specialized bilingual glossary and three cycles of translation routines:
glossary routines, routines for analysis of the input sentence, and routines
for synthesis of the output sentence.

2. The Anglo-Russian M.T. glossary now available has been designed
for the translation of scientific literature dealing with problems of
applied mathematics: the solution of systems of linear, algebraic, and
transcendental equations, calculation of the proper values of matrices,
approximation of functions by means of polynomials as well as by
trigonometric functions, expansion of functions into series, numerical
differentiation and integration, numerical solution of differential
equations, and other problems of numerical analysis.

The glossary contains 2300 words. Several works by English authors
were used for compilation and checking.

Text checking of the glossary for translation of mathematical
literature yielded satisfactory results. Some 3000 sentences consisting of
more than 100 connected passages from the material of different authors
were used as the corpus.

3. A glossary for the machine translation of scientific literature
may be usefully divided into a series of independent "specialized"
glossaries. Further specialization down to relatively independent fields
within a given science (mathematics, physics, and chemistry) is also
worthwhile.


This division serves two purposes: it reduces the necessary bulk of
the glossary to the completely manageable number of 3000-3500 words and,
even more important, considerably reduces the amount of polysemy.

The structure of the Anglo-Russian glossary for M.T. is such that
its several sections may be expanded independently. 

The glossary has two main sections:

I Single-meaning glossary and

II Multiple-meaning glossary.

Each section is divided into two subsections:
Ia - glossary of terms,
Ib - glossary of words in general use,

IIa - glossary of words with complete meaning,

IIb - glossary of auxiliary words.

In size, the multiple-meaning glossary takes up about 1/5 of the entire
glossary which, in this instance, amounts to 458 words.
The problem of polysemy is satisfactorily solved by combining
two methods: (a) narrow specialization of a series of glossaries for M.T.
and (b) contextual (functional-semantic) analysis of words in the sentence.
Experience shows that it is virtually unnecessary in scientific and
technological texts to go beyond the "small context" (i.e., one sentence).

6. In order that the lexical analysis of the words be effected
automatically (without human intervention), the M.T. glossary is accompanied
by a series of special glossary routines that make up cycle I in the
overall system of translation routines.


These include:
1. A routine for obtaining the glossary form of the words.
2. A routine for the grammatical analysis of "unknown words".
3. A routine for the grammatical analysis of "formulas".
4. A routine for distinguishing homonyms.
5. A routine for the analysis of polysemy.

The last routine is the most important from both the theoretical and
the practical points of view.


7. The lexical analysis, which is performed by means of the glossary
and glossary routines, precedes the grammatical analysis and provides it
with the necessary initial information in the form of the so-called
"invariant characteristics" of each "known" word (i.e., entered in the M.T.
glossary) and the syntactic characteristics of all the "unknown" words
(not entered in the M.T. glossary) and the "formulas".


8. The grammatical analysis of input sentences is performed by means
of a series of routines in cycle II in the following order:

1. Analysis of verbs ("verb" routine);

2. Analysis of punctuation marks;

3. Syntactic analysis of sentences: division of sentence into
clauses and more precise definition of parenthetical phrases
in clause (we define a sentence as that segment of text which
is included between full stops (period, exclamation or
interrogation point); a clause is a simple sentence, i.e., such
that it contains no more than one heterogeneous predicate);

4. Analysis of substantives and numerals;

5. Analysis of adjectives;

6. Modification of word order in the translated sentence.

The "verb" routine is the key routine in the first half of the analysis
of English sentences; however, the syntactic analysis of sentences (routine
3) is the basis of operation for the second half of the analysis and
determines the boundaries of those segments within which the subsequent
analysis is effected.

9. The routines in cycle III use the results of the preceding routines
in such a way that the Russian sentence obtains its grammatical form in
accordance with the rules of Russian grammar.

The synthesis routines go into operation just at the time when the
variant (contextual) grammatical signs for all variable words in the output
sentence are obtained and the steps taken to adjust the word order to
Russian norms.

In the place of the Russian numbers, which represented Russian words
up to this time, Russian equivalents are selected from the glossary, after
which the variable words (verbs, substantives, numerals, and adjectives)
are handled by the synthesis routines: a word ending is changed whenever
the desired word form does not coincide with the dictionary form of the
word.

10. Synthesis routines operate in the following order:

1. Word-forming routine;

2. "Verb" routine;

3. "Adjective" routine;

4. "Substantive" routine.

Changes in the numerals are effected partly in the "substantive"
routine, partly in the "adjective" routine.

The word-forming routine occupies a special place: it provides for
various cases going beyond word changes while inserting the grammatical
signs of the Russian word derived from analysis of the foreign sentence.


44. WORK ON A NORWEGIAN-RUSSIAN ALGORITHM OF

mmm 

V. P. Berkov (Leningrad) 

I. The projected set of programs is: A. Analytic part: (1)
morphological program; (2) program for distinguishing homonyms; (3) syntactic
program; B. Glossary part: (4) glossary-address; (5) regular glossary;
(6) glossary of phrases and idioms; (7) program for compound words; (8)
propositional program; (9) program for unification of orthography; C.
Synthesizing part.

II. Two methods of analysis, different in principle, were initially
contemplated:

(a) To begin with a search for words in the glossary;

(b) To begin by extracting grammatical information from the text
before referring to the glossary on the basis of a supporting
program (lists of indisputable endings, word-forming suffixes,
supporting words, etc.). Due to the extensive amount of
grammatical homonymy in Norwegian, the second method seemed
very cumbersome and, in some cases, practically unsound. It
has therefore been rejected.

III. The fact that the functioning of the algorithm, which begins
with a search for the words in the glossary and withdrawal of the
information located there into the operative memory, leads to clogging the
latter with information that is as a rule temporarily superfluous (in
some cases this is general) suggested the idea of creating typical sets
of information.

IV. Programs (1) and (2) are now (beginning of March 1958) ready in
rough form. An ending obtained by stripping the dictionary stem from the
text form of the word is compared with the list of endings; if the given
word has a single grammatical meaning, an information suffix is attached
to it and no further action is taken on the word at this stage. Cases of
grammatical homonymy are handled by a series of special programs (2).

On extracting all the grammatical information from the text, linear
transfers of words are made in order to impart a standard appearance to
the items derived by "unrolling" /RAZVERTKI/; this is done by a part of
program (3).
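
The ending comparison described in IV can be sketched as follows. This
is only an illustration in present-day Python, not the authors' program:
the table of endings, the feature labels, and the name analyze are
assumptions made for the example, and the Norwegian endings shown are a
toy sample.

```python
# Hypothetical table of endings: each ending maps to the grammatical
# readings it may signal. More than one reading is grammatical homonymy.
ENDINGS = {
    "": ["noun sg indef"],
    "en": ["noun sg def"],
    "er": ["noun pl indef", "verb present"],  # homonymous ending
    "ene": ["noun pl def"],
}


def analyze(text_form, stem):
    """Strip the dictionary stem from the text form and classify the ending.

    Returns (status, readings). Status is "resolved" for a single
    grammatical meaning, "homonymous" when a further program must choose
    among several readings, and "unknown" when the ending is not listed.
    """
    if not text_form.startswith(stem):
        return ("unknown", [])
    ending = text_form[len(stem):]
    readings = ENDINGS.get(ending)
    if readings is None:
        return ("unknown", [])
    if len(readings) == 1:
        return ("resolved", readings)
    return ("homonymous", readings)


print(analyze("bilen", "bil"))  # one reading: information can be attached
print(analyze("biler", "bil"))  # several readings: left to the programs (2)
```

Keeping the homonymous case as an explicit status mirrors the division
of labor in the text: the morphological program settles what it can and
defers the rest.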

V. The program for the unification of orthography is the specifically
Norwegian part of the algorithm. The need for this program is dictated
by the considerable amount of inconsistency in Norwegian orthography, even
in scientific texts; without the program, the glossary would necessarily
be overloaded with many pairs of words.

VI. The program for unification of orthography will be used as a
basis on which to construct an adjusting program in connection with the
use of this algorithm for Danish.

45. GLOSSARY STRUCTURE AND INFORMATION CODING

TO . mnm fmgrn g 

I. L. Bratchikov, S. Ya. Fitialov,
and G. S. Tseitin (Leningrad) 

1. Consideration is being given to the problem of introducing a 
glossary on tape into the machine to search for coincidences in the event 
that the glossary does not fit into the operational memory. 

2. A glossary structure is proposed that will accelerate searching 
and decrease the size of the indispensable portion of the memory for the 
size of the glossary under consideration. 

3. The previously suggested process of "rolling up the codes"
/SVERTYVANIYA KODOV/ is now in use. The rolled-up code is directly
utilized to obtain the address of information on words in the glossary.

We have provided for cases of coincidences of addresses thus obtained
("rolling up" /SVERTOCHNAYA/ homonymy) by differentiating routines
included in dictionary compartments, the addresses of which are not
addresses of the words.
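
In present-day terms, "rolling up the codes" resembles hashing: the
letter code of a word is folded into a short number that serves directly
as a glossary address, and words that fold to the same address are told
apart by a further test. The sketch below illustrates that reading only.
The folding function, the table size, and all names are assumptions and
not the method actually used by the authors.

```python
TABLE_SIZE = 101  # assumed number of glossary compartments


def roll_up(word):
    """Fold the character codes of a word into a compartment address."""
    address = 0
    for ch in word:
        address = (address * 31 + ord(ch)) % TABLE_SIZE
    return address


class Glossary:
    def __init__(self):
        # Each compartment holds (word, information) pairs. More than one
        # pair in a compartment is a coincidence of rolled-up addresses.
        self.compartments = [[] for _ in range(TABLE_SIZE)]

    def add(self, word, info):
        self.compartments[roll_up(word)].append((word, info))

    def look_up(self, word):
        # Differentiating step: compare the full word inside the compartment.
        for stored, info in self.compartments[roll_up(word)]:
            if stored == word:
                return info
        return None


glossary = Glossary()
glossary.add("function", "noun, stem no. 17")
glossary.add("integral", "noun, stem no. 42")
print(glossary.look_up("integral"))
print(glossary.look_up("matrix"))  # not in the glossary
```

Because the address is computed from the word itself, no search through
the whole glossary is needed; only the few entries sharing a compartment
are compared.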

4. Theoretical probability considerations have enabled us to obtain
results which, based on the given number of words in the glossary and the
volume of lexical information, make it possible to estimate the necessary
size of the memory to accommodate the glossary.

5. Methods are also suggested for programming certain operators
encountered in the algorithms of machine translation. 

46. GENDER AS A SUPERFLUOUS CATEGORY OF 
RUSSIAN VERB 


V. N. Vinogradova (Moscow) 

It is very important for machine translation to discover the
grammatical categories of a language concerning which there is no need to
give information insofar as translation can be effected without taking
them into account. Certain general considerations suggest that gender
in the Russian verb (an uncharacteristic phenomenon expressed only
in -l forms, the singular of the past tense and of the conditional mood)
is one of these categories.

We tested this assumption on a mathematical /I. G. Petrovskii,
Discourses on the Theory of Differential Equations, 1954/ text where the
number of verbs with gender expressed turned out to constitute only
about 4.7% (93) of the total number of verbs (1970). We then selected
linguistic (history of language) /A. A. Shakhmatov, Historic Morphology
of the Russian Language, 1957, pages 9-65/ and historic /B. D. Grekov,
Kievan Russia, State Publishing House of Political Literature, 1953, pages
14 £/ texts in order to have a large number of diversified examples and 

of texts used. It appears that in most sentences the verb may be related
only to the subject, a single substantive in the nominative case. Doubts
may arise only in the case of transitive verbs where there is an object in
the accusative case that coincides in form with the nominative, of the type:
"Equation (6) yielded the general integral of this equation over the
entire surface except for the start of the coordinate". /Uravneniye (6)
davalo obshchii integral etovo uravneniya vo vsei ploskosti za isklyucheniyem
nachala koordinat/. Since we have a grammatical indication for both the
subject and the object purely in the past tense, and even then only when
the gender of the noun-subject differs from that of the noun-object, there
remains no other way of determining which is which than the word order of
the sentence: the rule that the subject comes first holds in the
overwhelming majority of cases. A rearrangement is, of course, possible for
the sake of logical emphasis, e.g.: "In the Russian language preponderance
has received the accent of the nominative plural." /V russkom yazyke
pereves poluchilo udareniye imenitel'novo mnozhestvennovo/. The phrase
poluchit' pereves /receives the preponderance = predominates/ will
evidently have to be listed in the glossary as a phrase combination.

It is possible to conceive of more complicated cases (we didn't
find any examples, but we paraphrase one of the sentences of the type
described above): "Change a... caused a shift of e to o before a hard
consonant." /Izmeneniye ... vyzval perekhod e v o pered tverdoi soglasnoi/.

Such a sentence is almost impossible with a predicate in the present tense
(or is very badly written: "Izmeneniye ... vyzyvayet perekhod..." will
clearly be misunderstood); even in the past tense it is awkward. Apparently,
rare instances of this kind will be edited; so too the following case in a
complex sentence: "The bishop asserted that his church land went along the
Lisichii ford, which was in the time of Prince Yuri." /Episkop utverzhdal,
chto yevo tserkovnaya zemlya idet po Lisichii brod, chto byl pri knyaze Yurii/.
In the absence of information on the gender of the verb byt', it is
impossible to determine whether the last clause modifies brod "ford" (brod,
chto byl... = kotoryi byl..., an obsolete meaning, according to Ushakov's
Tolkovyi Slovar' /dictionary/) or is a subordinate conjunctive clause
relating to the entire preceding clause. This ambiguity cannot be resolved
here by formal signs.

With the exception of the last example, the texts studied did not
contain a single instance where the lack of information on the gender of
verbs would have resulted in confusion. This permits the conclusion that
as far as machine translation is concerned gender in the Russian verb may
well be ignored.

47. THE SYNTHESIS OF RUSSIAN VERB FORMS IN MACHINE TRANSLATION

Z. M. Volotskaya (Moscow)

1. For the synthesis of Russian verb forms in machine translation it
is proposed to list in the glossary of stems only the stem of the
imperfective aspect of each verb. All the forms of the present, past, and
future tense, perfective and imperfective aspect (personal as well as
impersonal) are formed from this stem in accordance with definite rules.

2. It is suggested that three types of operation are sufficient to
make all possible verbal forms from the single stem: (a) discarding the
final letter or letters, (b) adding a letter or letters to the stem on the
right, and (c) adding a letter or letters to the stem on the left.

All the individual letters and combinations of letters which are
joined to the stem on the left and on the right are assigned by a list
and arranged in tables in accordance with a definite system.

3. All the verbs are classified in three groups depending on the
method of producing: (a) the forms of the present tense, (b) the forms
of the past tense, and (c) the stem of the perfective aspect from the
stem of the imperfective aspect.

By class of verbs we mean the total number of verbs that construct
a given form in the same way.

4. The information for each verb stem contains the class number of 
the stem, which indicates the way in which a given form is to be con- 
structed. 
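
The three operations of point 2 can be sketched as follows. The rule
format, the sample rules, and the name apply_rule are assumptions made
for illustration; the tables actually proposed are not given in the
abstract. Transliterated Russian is used for readability.

```python
def apply_rule(stem, drop=0, suffix="", prefix=""):
    """Build a verb form from a stem with the three proposed operations:
    (a) discard `drop` final letters, (b) add `suffix` on the right,
    (c) add `prefix` on the left."""
    base = stem[:len(stem) - drop] if drop else stem
    return prefix + base + suffix


# Hypothetical rules for one verb class, keyed by the form wanted.
# The stem listed in the glossary is the imperfective one.
RULES = {
    "present 3sg": {"suffix": "yet"},
    "past masc sg": {"suffix": "l"},
    "perfective past masc sg": {"prefix": "pro", "suffix": "l"},
}

stem = "chita"  # imperfective stem of chitat' "to read"
for name, rule in RULES.items():
    print(name, "->", apply_rule(stem, **rule))

# A verb of another assumed class needs operation (a) as well:
# risova- (risovat' "to draw") loses "ova" before the present ending.
print(apply_rule("risova", drop=3, suffix="uyet"))
```

In such a scheme the class number stored with each stem would select the
rule table, so that the glossary itself carries only one stem per verb.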


48. RUSSIAN SYNTAGMAS 
(on the basis of mathematical texts) 

Z. M. Volotskaya, Ye. V. Paducheva,
I. N. Shelimova, and A. L. Shumilina

(Moscow) 


1. This report discusses the basic types of two-word combinations in
subordinate relationship (syntagmas) as found in mathematical texts and
by means of which it is possible to construct the rules of formal text
analysis (for machine translation).

2. The syntagmas were based on specific word combinations drawn from 
the texts.

Syntagmas are considered to differ from each other in type of
syntactic relations between their component parts. Therefore, not all the
morphological and syntactic signs of the words that form the given
combinations served as criteria for relating these combinations to the
various syntagmas.

3. A syntagma consists of two components, "governing" and "governed",
each of which is accompanied in the list of syntagmas by certain
information.

As a rule, the "syntactic group" is the essential information for the
"governing", the "morphological form" for the "governed" component.

4. Words are divided into "syntactic groups" on the basis of the
following principle of marking words according to the sign of a common
syntactic connection: first, those words which have a single common
syntactic connection are separated from the mass of words into one group;
then, those words which have another syntactic connection are separated
from the same mass, etc. The same words may fall into different groups
which consequently appear to be crossing each other.

The separation of syntactic groups not only according to one but
according to a combination of signs should lead to a significant increase
in the number of syntactic groups and, correspondingly, in the number of
syntagmas.

5. The report includes a list of syntagmas, description, and
discussion of possible ways of using them in text analysis.

49. SYNTHESIS OF THE RUSSIAN CLAUSE

Z. M. Volotskaya and A. L. Shumilina

(Moscow) 

1. Sentence synthesis in machine translation consists of combining
words into clauses and clauses into sentences according to the requirements
for sentence building in a given language.

2. The aggregate of syntagmas in each sentence that are obtained by
analyzing the language from which a translation is made does not constitute
an adequate basis for synthesizing sentences of the language into which
the translation is made. Correspondences must be established between the
languages in question not only on the syntagmatic level but also on the
sentence level.

3. A clause is synthesized by inserting a syntagma, i.e., one syntagma 
as it were overlays and draws into itself another. 

4. Each word in the clause of the output language obtains, in addition
to the information necessary for translation (number of stem in the output
language, number, tense, etc.), the following signs: (a) number of the
syntagma into which it is entering as a governing word (by the first
method, cf. below) or as a governed word (by the second method); (b)
ordinal numbers of the words (from the input language) with which the given
word forms syntagmas.

In combining words into clauses it is more convenient to use the ordinal
numbers of words from the sentence of the input language and not the numbers
of the output stems because using only the latter might lead to mistakes
inasmuch as the sentence may contain several identical lexemes or different
ones, but with the same stem.

5. There are two possible ways of synthesizing a clause by means of
syntagmas:

(a) Isolating the pivotal syntagmas (predicatives) and successively
expanding each component at the expense of the governed words.

(b) Synthesizing a clause by successively combining syntagmas until
they are reduced to the predicative. Moreover, each syntagma enters as a
single group into a higher rank syntagma as a governed, expanded component.
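
Method (a) can be sketched as a recursive expansion from the predicative
syntagma. The sketch is an illustration under simplifying assumptions:
syntagmas are reduced to (governing, governed) pairs of ordinal numbers,
the English sample stands in for real data, and the result is kept in
input order, whereas actual synthesis would also adjust word order. All
names are hypothetical.

```python
from collections import defaultdict

# Words keyed by their ordinal numbers in the input sentence, and
# syntagmas as (governing, governed) pairs of those numbers.
WORDS = {1: "this", 2: "function", 3: "has", 4: "continuous", 5: "derivative"}
SYNTAGMAS = [(3, 2), (2, 1), (3, 5), (5, 4)]
PREDICATIVE = (3, 2)  # pivotal syntagma: predicate "has", subject "function"


def expand(head, governed_by):
    """Return the group headed by `head`: the word itself plus,
    recursively, the groups of all words it governs, in input order."""
    group = [head]
    for dependent in governed_by[head]:
        group.extend(expand(dependent, governed_by))
    return sorted(group)


def synthesize(words, syntagmas, predicative):
    governed_by = defaultdict(list)
    for governing, governed in syntagmas:
        governed_by[governing].append(governed)
    pivot = predicative[0]  # start from the governing word of the pivot
    return [words[i] for i in expand(pivot, governed_by)]


print(" ".join(synthesize(WORDS, SYNTAGMAS, PREDICATIVE)))
```

Referring to words by ordinal number rather than by stem, as point 4
recommends, keeps two occurrences of the same lexeme distinct.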


50. GRAMMATICAL ANALYSIS FOR MACHINE TRANSLATION

of rare msm 


V. A. Voronin (Moscow) 


The system of grammatical analysis for machine translation of Chinese 
into Russian was based on materials from contemporary scientific and 
technological texts in mathematics, electrical engineering and construction.
It utilized the fundamental works of Soviet and Chinese authors on the
modern Chinese language. The system was tested on mathematics articles from
the Chinese periodicals Shusyue syuebao (Mathematics Herald) and Shusyue
tsin'chzhan' (Successes of the Mathematical Sciences). In constructing the
system we did not have the task of solving the extensive and manifold 
grammatical problems connected with machine translation of literary and 
socio-political literature. However, we did take cognizance of gram- 
matical phenomena characteristic of Chinese as a whole. 

Treatment of the Chinese sentence according to the system of
grammatical analysis starts after operation of the glossary and glossary
supplement is completed, as a result of which words in the sentence enter
the system with concrete relevant meaning and complete lexical
characteristics, i.e. with the set of necessary signs.

The special grammatical structure of Chinese possesses an extremely
small number of formal means by which one can identify the full
morphological properties of the Russian equivalent for the Chinese word
within a given lexical unit. Therefore, a Chinese sentence cannot be
processed for machine translation without an analysis of the syntactic
structure of the sentence to be translated, which was predetermined by the
general principles underlying the system.

The system, operated in the form of routines, consists of two main
parts: (1) syntactic analysis of sentences, and (2) production of the
morphological characteristics of the Russian equivalent. The entire system
includes 9 interrelated, successively functioning routines.

The first part has 4 routines in which the following stages of
syntactic analysis are effected in corresponding order:

(1) Breakdown of the input sentence into simple clauses.

(2) Separation of attribute + attributed word groups.

(3) and (4). Separation of other (than attributive) syntactic
components of the clause.

The second part of the system has 5 routines of which 4, on the basis
of existing syntactic signs, produce the morphological characteristics
for the Russian equivalents of all the words in the Chinese sentence. The
classes of words mentioned below are handled in the order given:

(1) Numeral,

(2) Substantive,

(3) Verb,

(4) Adjective.

The operation of the fifth routine consists of changing Chinese word 
order in accordance with the norms of Russian word order. 

The system as a whole comes down to producing the formal signs that
reflect in the first part the syntactic function of the word and in the
second part the morphological features of the Russian equivalent of the
Chinese word.

An adequate, readable translation is ensured by performing a combined
lexico-grammatical analysis of the Chinese text put into the machine.

51. APPLICATION OF MACHINE TRANSLATION METHODS TO
THE LEXICAL CODING OF TELEGRAPHIC AND

V. I. Grigor'yev and G. G. Belonogov (Moscow)

1. Men have been searching from ancient times for the most effective
utilization of the channels of communication. Up to now the main efforts
of engineers and communications experts have been aimed at perfecting the
communication channel proper and at seeking ways of transforming the
signal so as to secure the maximum suitability of the signal to the given
channel. The contents of communications meanwhile remained unchanged.
However, the possibilities have now for the most part become exhausted so
that the problem of finding means of reducing the size of messages
transmitted is becoming increasingly urgent.

2. The size of a telegram may be shortened 3-4 times if a lexical
code is used instead of a literal code. A telegraphic communication
that uses lexical coding differs from an ordinary printed letter
communication only in that what is sent is not code groups designating
letters of the alphabet, but a code combination designating the ordinal
number of the word according to the dictionary in the memory device plus
certain items containing grammatical information about the word transmitted.

3. The principle of lexical coding of messages has been known
since ancient times. It is employed in various kinds of signal tables, in
the international radio code, and elsewhere. However, in all these cases
coding is done manually, requiring great effort and considerable
expenditure of time. The development of computer technology has now made
possible automatization of the process of lexical coding and its wide use
in communications.

4. Lexical coding is based on an analysis of the message at the
transmitting end and its subsequent synthesis at the reception end of the
line of communication. This lexical analysis and synthesis of a message
is essentially a simplified form of the analysis and synthesis of a text
produced by machine translation. It is therefore worthwhile, when
preparing an algorithm for lexical coding, to make full use of the method
of text analysis and synthesis used for machine translation.

5. Lexical coding has, in addition, several peculiarities. Text
analysis and synthesis in the case of machine translation is aimed at
securing the operation of hieroglyphic conversion, a basic operation in
machine translation. Elimination of hieroglyphic conversion would
lead to considerable simplification of the routines of analysis and
synthesis in the case of lexical coding. On the other hand, with lexical
coding the demand for code economy is pushed to the foreground, whereas it
is of purely secondary significance as far as machine translation is
concerned. Lexical coding must rest to a large degree on speech statistics.
In particular, due to the interlinking of analyzer with the devices of the
channel of communication, the size of the dictionary cannot be so conveniently
large. Available statistics permit limitation of the dictionary of the
lexical analyzer to a maximum of 4000 words in ordinary use, which generally
make up 97.5% of a literary text. Rare words not found in the dictionary
may be transmitted letter-by-letter.
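
The scheme of points 2 and 5 can be sketched as follows. This is a
minimal illustration, not the authors' algorithm: the five-word
dictionary, the tags "W" and "L", and the function names are assumptions,
and the grammatical information that would accompany each ordinal number
is left out.

```python
# Assumed toy dictionary; the abstract speaks of up to 4000 common words.
DICTIONARY = ["arrive", "meet", "monday", "send", "train"]
INDEX = {word: number for number, word in enumerate(DICTIONARY)}


def encode(message):
    """Replace each word by ("W", ordinal number) when it is in the
    dictionary, otherwise by ("L", letters) for letter-by-letter sending."""
    coded = []
    for word in message.lower().split():
        if word in INDEX:
            coded.append(("W", INDEX[word]))
        else:
            coded.append(("L", word))
    return coded


def decode(coded):
    """Rebuild the message at the receiving end from the coded groups."""
    words = []
    for kind, value in coded:
        words.append(DICTIONARY[value] if kind == "W" else value)
    return " ".join(words)


message = "meet train monday ivanov"
coded = encode(message)
print(coded)
print(decode(coded))
```

With 4000 dictionary entries an ordinal number fits in 12 binary digits,
since 2^12 = 4096, whatever the length of the word it stands for.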

6. Application of the principles of lexical coding to telephonic
communication may help greatly in solving the problem of maximum closeness
of compression.

52. SOME PROBLEMS IN MACHINE TRANSLATION FROM
JAPANESE INTO RUSSIAN

M. B. Yefimov (Moscow) 

The purpose of this communication is to set forth some principles 
involved in analyzing Japanese sentences for machine translation, the 
principles being characteristic of the Japanese language alone. 

A. The primary problem with which we have to deal in analyzing a
Japanese sentence is its division into separate words. This is typical
chiefly of languages with an ideographic form of script (Japanese, Chinese,
etc.). The fact is that words are not separated in a written Japanese
text and, consequently, identification of their role in a sentence is quite
difficult.

We shall try to show in this report how we made the division in our 
work. 


We began with the fact that the Japanese script uses the signs of a
syllabary (kana) along with ideograms.

Thus, the division of a Japanese sentence into separate words breaks
down into 3 main steps:

(1) Analysis of portions of sentences containing both ideograms 
and syllabary. 

(2) Analysis of ideographic part. 

(3) Analysis of syllabary part. 

This operation is closely linked to the operation of the existing 
Japanese glossary and is, so to speak, one of its parts. 
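
The starting observation, that syllabary signs and ideograms alternate
in the text, can be illustrated with a short present-day sketch. It only
separates the two kinds of sign into runs; the actual division into
words depends on the glossary and is not attempted here. The use of
Unicode block ranges and the function names are assumptions of the
example.

```python
from itertools import groupby


def script_of(ch):
    """Classify a character as syllabary ("kana"), ideogram ("kanji"),
    or "other", using Unicode block ranges."""
    code = ord(ch)
    if 0x3040 <= code <= 0x30FF:   # Hiragana and Katakana blocks
        return "kana"
    if 0x4E00 <= code <= 0x9FFF:   # CJK Unified Ideographs block
        return "kanji"
    return "other"


def split_runs(sentence):
    """Split a sentence into maximal runs of characters of one script."""
    return [(script, "".join(chars))
            for script, chars in groupby(sentence, key=script_of)]


for script, run in split_runs("私は本を読む"):
    print(script, run)
```

Each boundary between an ideographic run and a syllabary run is a
candidate word or affix boundary, which is what makes the mixed portions
of a sentence the natural first step of the analysis.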

B. Breaking down a sentence into its individual clauses is no less
important a problem in Japanese-Russian translation and has both practical
and theoretical interest.

In this work we are relying chiefly on the rigid structure of the
Japanese sentence in which either a verb or a predicate adjective always
stands at the end. This enables us infallibly to determine the end of the
sentence.

The beginning of the sentence is determined by searching for the
subject.

Thus, the entire operation consists of two steps:

1. Determination of the end of the sentence, and

2. Determination of the beginning of the sentence.

C. As is true of all languages, the verb constitutes the greatest 
difficulty in translating from Japanese into Russian. 

The strongly developed affixation that is characteristic of Japanese 
is most clearly marked in the verb. 

This determined the cyclical nature of our operation.

We used the fundamental rules of traditional grammar for the analysis
of verb endings, relying mainly on the five stems of the Japanese verb.

We have been successful in establishing the necessary grammatical
and syntactic criteria for all verbs.

53. WORK ON THE RUSSO-ENGLISH ALGORITHM OF

TRANSLATION


L. N. Zasorina (Leningrad) 

1. Limitation of problems and scope of work. Choice of mathematical
text as being most limited in stylistic peculiarities.

Determination of set of programs for Russo-English algorithm. Ex- 
clusion of program of differentiating homonyms due to synthetic structure 
of Russian. Simultaneous work on glossary and morphology program. 

2. Combined investigation of short text. Compilation of glossary in 
which the grammatical form and syntactic relations of the words are 
registered. Recording of statistical data. 

3. Investigation of individual parts of speech, division of words into
classes, and preliminary detection of homonymy between the parts of speech.

4. Verb and grammatical information derived from personal forms and
nominal forms. Homonymy of participles and adjectives distinguished by
taking into account suffixes of full and short forms of participles.

Lack of formal-graphic separation of auxiliary and modal verbs from the
verb class.

Adjective class comprising adjectives, adverbs in -o, -ski,
ordinal numerals, words in the status category. Arrangement in
non-specified subclasses. The substantive class including nouns,
substantivized words and cardinal numerals (other than odin /one/, dva /two/,
tri /three/, chetyre /four/) is distinguished by the abundance of homonymic
case forms: intraclass homonymy and interclass homonymy. Separation of
non-specified subclasses. Triliteral word class. Class of invariable words
is characterized by negative separability in the text.

4. Advisability of introducing stem-stripping program. Planning of
groups of commands for the individual classes. Many-sided investigation
of homonymic coincidences of separable affixes.

5. Problems connected with differentiating grammatical data derived
from homonymic affixes. Tables of separable, restrictive lists of letters
that precede the separable affixes. Successive separation of affixes from
stem (ending and form-constructing suffixes) and storage of grammatical
information derived.

Table for verifying matching of preliminary information obtained
from affixes and stem glossary.

Method of multistage depositing of grammatical information derived
from the glossary and stem-stripping program.

Attempt at dividing grammatical data into two non-crossing fields
to reduce the number of tests of possible grammatical forms.

6. Compilation of stem glossary. Determination of general size
and limits of glossary. "Lexical article" plan, taking into account
input and output information and list of possible forms.

Obtaining pseudostems.

Problems in contracting the glossary by separating word-building
suffixes and prefixes.

7. General routine for processing words: stem-stripping program,
stem glossary, morphology program. Obtaining input information for
the syntactic program.

54. WORK ON A HINDUSTANI (HINDI) - RUSSIAN ALGORITHM
OF MACHINE TRANSLATION

T. Ye. Katenina (Leningrad)

1. The development of a Hindi-Russian algorithm is very important
for similar work in the field of Indian languages, both Indo-Aryan and
Dravidian. The structure of Hindustani is in the main analytical,
although the traces of ancient inflection and agglutinative elements (a
new synthesis) play a definite role. The scientific style of Hindi prose is
characterized by a more or less definite word order close to that of the
Dravidian languages. Numerous phrases containing non-conjugated verb
forms, equivalent to subordinate clauses, constitute the main difficulty
for machine translation. Scientific texts are characterized by an
abundance of Sanskritisms which are frequently translated loan words of
international (European) terms.

2. Hindi writing, phonetic for the most part, is therefore especially
convenient for an electric reading device. To record texts we worked out
a mechanical transcription based on the Russian alphabet without complicated
signs and diacritics. In addition statistics justified our combining
several Hindustani sounds.

3. The set of programs for machine translation is as follows:

(1) glossary of stems (2) morphology program (3) postposition program (4)
syntactic program (5) program for differentiating homonyms (6) list of
idioms (7) a translation program of compound words may be required for
some kinds of scientific texts.

4. In order to avoid superfluous information we adopted the following
hypothesis for the syntactic analysis of a simple sentence: (1) the first
noun substantive in a direct or active case is the subject (2) the verb
in the last place in the sentence is the predicate (3) if the verb is not
a copula, the noun substantive in the next-to-last place in the sentence
with the postposition ko or in the direct case (not the subject) is the
direct object.
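
The three-part hypothesis can be written out as a short sketch. The
token format, the field names, and the function name are assumptions
made for illustration, and the object postposition is written "ko", the
usual transliteration of the Hindi object marker. The sample sentence is
a schematic transliteration, not taken from the texts studied.

```python
def analyze_simple_sentence(tokens):
    """Apply the three-part hypothesis to a list of token dicts.

    Each token has "word" and "pos" ("noun", "verb", ...); nouns carry
    "case" ("direct", "active", "indirect") and optionally
    "postposition"; a verb may carry "copula": True.
    Returns a dict mapping role names to token positions.
    """
    roles = {}
    # (1) The first noun in a direct or active case is the subject.
    for i, tok in enumerate(tokens):
        if tok["pos"] == "noun" and tok.get("case") in ("direct", "active"):
            roles["subject"] = i
            break
    # (2) The verb in the last place in the sentence is the predicate.
    if tokens and tokens[-1]["pos"] == "verb":
        roles["predicate"] = len(tokens) - 1
        # (3) If the verb is not a copula, the noun in next-to-last place,
        # with postposition "ko" or in the direct case, is the direct object.
        if not tokens[-1].get("copula") and len(tokens) >= 2:
            i = len(tokens) - 2
            tok = tokens[i]
            if (tok["pos"] == "noun" and i != roles.get("subject")
                    and (tok.get("postposition") == "ko"
                         or tok.get("case") == "direct")):
                roles["object"] = i
    return roles


sentence = [
    {"word": "larka", "pos": "noun", "case": "direct"},
    {"word": "kitab", "pos": "noun", "case": "direct"},
    {"word": "parhta hai", "pos": "verb"},
]
print(analyze_simple_sentence(sentence))
```

The rules rely on position and a few case marks only, which is the sense
in which the hypothesis avoids superfluous information.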

We have determined the necessary minimum of morphological information
(which, however, requires statistical confirmation in individual cases)
to be:

(1) for the noun substantive: number, case (direct, active, indirect); (2)
for nominal adjective: number (may be important to determine the number of
noun substantives with zero ending of direct case plural number); (3) for
the verb: tense, mood, number (to determine the number of the same noun
substantives), voice. A check of the text showed that the overwhelming
majority of simple sentences as well as the constituent parts of complex
sentences may be analyzed in accordance with these rules.

5. Among the basic problems requiring a solution for subsequent
work in constructing a Hindustani-Russian algorithm are: (1) elucidation
of rules for analyzing complex sentences and equivalent phrases with
non-conjugated verb forms, (2) clarification on a statistical basis of the
need to design a program analyzing compound words that would be compulsory
for all kinds of texts.

55. AN ALGORITHM FOR TRANSLATING ENGLISH

tex ts 6K mfo BMate ms mgr 


K. T. Komissarova (Gorki)

The translation rules and glossary have been worked out with regard 
for the characteristics of English texts dealing with radio engineering. 

The translation process is divided into 2 main parts: analysis of
English sentences and synthesis of Russian sentences.

Analysis of an English text is based on a syntactic analysis of the 
sentence. The grammatical function of a word is determined by morphological 
and syntactic analysis according to rules grouped by the parts of speech. 

The glossary contains more than 500 words in general use and specialized
technical terms.

56. AUTOMATIZATION OF TRANSLATION PROGRAMMING


O. S. Kulagina (Moscow)

1. Long, tedious process of constructing translation programs causes
need to automatize programming. Requirements of translation programs and
impossibility of using existing programming programs. Formulation of
problem of automatizing translation programming.

2. Breakdown of translation algorithms into operators. Types of
operators and functions of each. Parameters of operators.

3. Preparation of translation algorithm for translation: recording
of algorithm in the form of sequence of simple rules, transition from this
recording to operator recording, automatic construction of translation
program according to operator recording of algorithm by means of compiling
program.

4. Compiling program, its structure. Some features of structure of
programs obtained by the method described.

57. A FRENCH-RUSSIAN TRANSLATION ALGORITHM

O. S. Kulagina (Moscow)

(1) Formulation of problems; translation of mathematical texts.
Demands for quality in translation; cases requiring editing.

(2) Structure of glossary for machine translation; features. Glossary
information and purpose. Glossary of phrases.

(3) Principles in constructing translation algorithm. Structure of
algorithm and order of operation. Word look-up in glossary. Treatment
of phrases. Differentiation of homonyms and analysis of polysemants; order
of operation of rules for differentiating homonyms. Analysis of French
sentence: problems. Sequence of handling parts of speech during analysis.
Character of information obtained through analysis. Change of word order
in translation. Synthesis of Russian sentences: order of operation of
synthesising rules and how they differ from analysing rules.

(4) Supplementing and correcting algorithm on the basis of experimental
translations (greater precision in rules for differentiating homonyms,
change in handling of adjectives, separation of morphological from syntactic
analysis).

58. DETERMINATION OF SYNTACTIC CONNECTIONS FOR FORMULAS
[remainder of title illegible]


M. M. Langleben (Moscow)

1. We call "formulas" all text elements not found in a mechanical
glossary in processing a text (surnames, mathematical formulas, foreign
references, neologisms, etc.). "Formulas", like words to be translated,
require the ascertaining of syntactic connections in the text to be
analyzed, i.e., the identification of formulas that form part of one of
the previously given syntagmas.

2. The analysis of a "formula" is broken down into 2 parts:

(A) testing the formula proper for the presence of any
word-changing suffixes, the sequence of tests being determined by
frequency of the cases.

(B) analysis of its environment (words and punctuation marks).
This begins only after all the "formulas" contained in the
given segment of text have passed through part A.

3. The following order of ascertaining the possible syntactic
connections for "formulas" is advisable in that it eliminates the
possibility of establishing false syntagmas:

(a) the formula acts as an adjective for a substantive standing
on the right;

(b) the formula is a name with a substantive standing on the left;

(c) the formula forms part of a prepositional phrase;

(d) the formula forms part of a syntagma with an adjective requiring
the dative case (RAVNYI /equal/, KRATNYI /multiple/);

(e) the formula forms part of a syntagma with an adjective in the
comparative degree replacing a substantive in the genitive case;

(f) the formula replaces a governing substantive in an "adjective +
substantive" syntagma;

(g) the formula acts as a predicative combination.

These last are used to check various syntagmas with a verb; the function
of a formula with a verb is chiefly determined by its position on the right
or left of the formula, not by the form of the verb.

4. Since the analysis of "formulas" is a basic part of the routine
developed for the language as a whole, it will be performed piecemeal at
various stages of the total analysis.

59. ELIMINATION OF MORPHOLOGICAL AND SYNTACTIC
HOMONYMY [remainder of title illegible]

M. M. Langleben and Ye. V. Paducheva (Moscow)

1. Those words in a dictionary of stems that cannot be identified as
a fixed part of speech, e.g. "attempt" (verb, noun), "cool" (adjective,
verb), and "further" (adverb, adjective), etc., are handled as follows:

If a word can be a noun and a verb or an adjective and a verb, it is
inserted in a dictionary of substantives or verbs, respectively. Those
word-changing suffixes that can readily identify the one part of speech to
which a word belongs (-ed, -ing, but not -s) are listed in a table of
word-building suffixes, i.e., if the word has one of these endings, the
part of speech will be revealed after morphological analysis. However,
homonymic stems do not require any changes in the analysis routine provided
for the other words. (This method is based on a suggestion by A. I.
Smirnitskii, who defined conversion as word building by means of paradigms.)

2. If the part of speech cannot be readily determined by morphological
analysis (zero ending in word stems: "they attempt", "the attempt";
homonymic ending: "he attempts", "the attempts"), or if the parts of speech
which have no word-changing forms are homonymic ("further": adjective,
adverb), the word is assigned several syntactic functions corresponding to
the possibilities of the word to enter a syntagma as a substantive, verb,
etc. The possible functions are examined in a definite order and a syntagma
is established for the given word, depending on whether certain words are
present in the sentence; thereafter all the remaining functions listed are
dropped except that for which the syntagma was found.

3. Similarly, homonymy in -ing forms, -ed forms, etc. is eliminated
by successive tests for the presence of certain syntagmas in the sentence.
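
The procedure of examining the possible functions in a definite order and
keeping only the first one for which a syntagma is found can be illustrated
as follows. This is a minimal Python sketch, not the authors' routine: the
two context tests (a preceding article signals a substantive, a preceding
subject pronoun signals a verb) and all names are assumptions chosen to fit
the examples "the attempt" and "they attempt".

```python
ARTICLES = {"the", "a", "an"}
SUBJECT_PRONOUNS = {"i", "you", "he", "she", "it", "we", "they"}

def noun_test(words, i):
    # Hypothetical test: an article directly on the left forms a syntagma
    # with the word taken as a substantive.
    return i > 0 and words[i - 1].lower() in ARTICLES

def verb_test(words, i):
    # Hypothetical test: a subject pronoun directly on the left forms a
    # syntagma with the word taken as a verb.
    return i > 0 and words[i - 1].lower() in SUBJECT_PRONOUNS

# The order of this list is the "definite order" in which functions are tried.
ORDERED_TESTS = [("substantive", noun_test), ("verb", verb_test)]

def resolve_function(words, i, ordered_tests=ORDERED_TESTS):
    """Return the first function whose syntagma test succeeds, else None."""
    for function, test in ordered_tests:
        if test(words, i):
            return function  # all remaining functions are dropped
    return None
```

A result of None means that none of the listed tests found a syntagma, so
the word stays ambiguous for whatever later stage of analysis follows.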

60. THE SUPERFLUOUSNESS OF RUSSIAN ADJECTIVE INFLECTION

N. N. Leont'yeva and O. N. Vavilova (Moscow)

1. In machine translation from Russian the procedures for handling
the inflection of adjectives are quite cumbersome. The machine has to
perform a double task: first, to investigate the inflection of the
adjective, then to search for the substantive with which the adjective
agrees.

There is an easier way of relating an adjective to the substantive
with which it agrees, a way that ignores inflection in most cases.

2. When a Russian text is analysed, it usually turns out that adjective
inflection is superfluous as far as translation is concerned. It merely
indicates the agreement of the given adjective with a certain substantive.

3. An adjective may be related to the substantive with which it agrees
without analysis of its inflection by using the adjective's position in the
sentence.

An attributive adjective most frequently occupies a definite position
with respect to the substantive with which it agrees: it stands either
before this substantive or after it, following a comma.

Accordingly, it is possible to formulate two rough rules for relating
an adjective to its substantive:

(a) Relate the adjective to the nearest substantive on the right;

(b) If there is no substantive on the right, relate the adjective
to a substantive that is followed by a comma.
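
The two rough rules translate almost directly into code. The Python sketch
below is illustrative only: it assumes the sentence has already been reduced
to a list of part-of-speech tags, and the tag names (ADJ, NOUN, COMMA) and
the function name are hypothetical. It applies the two rules and nothing
more; it does not look at inflection at all.

```python
def relate_adjective(tags, adj_index):
    """Return the index of the substantive chosen for the adjective, or None.

    `tags` is a list of part-of-speech tags for one sentence.
    """
    # Rule (a): the nearest substantive on the right.
    for i in range(adj_index + 1, len(tags)):
        if tags[i] == "NOUN":
            return i
    # Rule (b): no substantive on the right, so take the nearest
    # substantive on the left that is immediately followed by a comma.
    for i in range(adj_index - 1, -1, -1):
        if tags[i] == "NOUN" and i + 1 < len(tags) and tags[i + 1] == "COMMA":
            return i
    return None
```

For the tag sequence NOUN, COMMA, ADJ, PREP the function returns index 0:
there is no substantive to the right of the adjective, so rule (b) picks
the substantive that precedes the comma.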

4. However, relating an adjective to a substantive in accordance with
these rules alone may turn out to be incorrect.

Therefore, a number of individual tests must be performed before
finally deciding the problem of relating an adjective: is the adjective
part of the nominal constituent of the predicate, is it included in a
formula, does it govern the following noun with or without a preposition
(VYVEDENNYI IZ FORMULY /deduced from the formula/, RAVNYI NULYU /equal to
zero/)?

5. After these checks the machine either relates the adjective to the
substantive without regard to its inflection or, if it cannot dispense with
it, analyses the inflection of the adjective.

6. An analysis of mathematical texts shows that without investigation
of inflection it is possible to relate more than 85% of all adjectives to
the appropriate substantives. The remaining 10-15% of the adjectives
require an analysis of the inflections.

7. In calculating the number of adjectives we excluded short
adjectives, the relative KOTORYI /which/, cases where the adjective is part
of a formula, and cases of ellipsis (the adjective is present, but not the
noun with which it agrees, e.g. OTLICHAYETSYA OT RASSMOTRENNYKH V ETOM
PARAGRAFE /it differs from the (things) considered in this paragraph/).

8. The practicability of a method to ascertain the possibility of
ignoring adjective inflection has still not been proved. This will re- 
quire further work on texts as well as more experience with machine 
translation, taking cognisance of technical difficulties. 

Nevertheless, the suggested routine for relating an adjective to its 
substantive by position criteria will retain its value, even if the
necessity for investigating the inflections of all adjectives is demon- 
strated, since inflection is merely one of the factors that control the 
correct relating of an adjective to its substantive by position criteria. 

61. AN ALGORITHM OF MACHINE TRANSLATION FROM 
ENGLISH INTO RUSSIAN

T. N. Moloshnaya (Moscow) 

I. (1) Different possibilities for formalising linguistic data in 
different languages. 

(2) Advantages of a structural-syntactic analysis of English. 

II. (1) Classification of English and Russian words according to
formal criteria.

(2) Grammatical configurations constructed from isolated
classes of words.


III. Analysis of English sentence structure according to grammatical
configurations.

(1) Replacement of grammatical configuration by its chief member.

(2) Sequence of ascertaining grammatical configurations in the sentence
to be analyzed.

IV. Synthesis of Russian sentence structure according to grammatical
configurations.

(1) Substitution of the English grammatical configuration used by the
corresponding Russian configuration.

(2) Morphological formation of Russian sentence structure.

(3) Definition of grammatical forms of words in the Russian sentence.

V. Elimination of lexico-grammatical homonymy in the English sentence
on the basis of:


(1) morphological data,

(2) syntactic data.


VI. Tests of machine analysis of English sentence structure.


62. A DEVICE FOR THE READING OF ORDINARY
[remainder of title illegible]


R. S. Muratov (Sverdlovsk)


1. Conversion of the graphic form of letters in a printed text into 
electrical signals is achieved by breaking down the group of photosensitive 
elements as they move along the line of text. 

2. Electrical impulses generated when photosensitive elements are
blacked out switch on electronic relays which, in turn, switch on a tactile
or phonic signalling instrument.

3. The form of the signals (of successive formation of elementary 
signals corresponding to each zone of disintegration) expresses the graphic 
peculiarities of the letters and other marks in the text. 

4. Correct reading of the signals requires preliminary instruction
by a reader.

63. ANALYSIS OF PUNCTUATION MARKS DURING MACHINE
TRANSLATION FROM RUSSIAN


T. N. Nikolayeva (Moscow)

1. The purpose of this operation is to obtain the distinguishing
features of punctuation marks during machine translation.

In translation from Russian each word in the sentence must receive
definite morphological and syntactic signs. The required signs are obtained
in different ways for each part of speech. In particular, in order to
determine the case and number of substantives it is necessary to know the
correlative position of the parts of speech within the limits of the closed
sentence /ZAMKNUTOGO PREDLOZHENIYA/. However, most Russian sentences are
complicated by parenthetical and set-off /i.e., by commas: OBOSOBLENNYKH/
constructions, subordinate clauses, etc.

Hence, to obtain the precise grammatical signs it is necessary to break
down a complex sentence into simpler components, dividing the main clause
from the subordinate clauses and separating the set-off and parenthetical
phrases.


Thus, the final goal of the analysis of punctuation marks is to:

(a) separate simple clauses from the body of the complex sentence,
to find the boundaries of the simple clause within the sentence;

(b) separate similar members of the clause;

(c) help the subsequent elucidation of interrelations between
the individual parts of the punctuated complex sentence;

(d) determine a group of similar members.


2. /sic/ The analysis is made within a single complex sentence.
Accordingly, "simple" and "multipurpose" punctuation marks are
distinguished. The simple ones (period, exclamation point, question mark,
and dots) serve as the boundaries of a complex sentence.

Multipurpose marks (comma, dash, and colon) unite simple clauses into
a complex clause, introduce subordinate clauses, and separate parenthetical
and set-off constructions.


We are devoting the bulk of our attention to the multipurpose marks.
They serve, according to Prof. A. B. Shapiro's terminology,
to divide or to separate. We are also paying special attention to the
problem of distinguishing between single and non-single punctuation marks
(e.g. those used at the end of a set-off phrase and the beginning of a
subordinate clause, etc.).

3. As a result of the analysis, all the multipurpose punctuation
marks receive one of the following signs:

(1) parenthetical (i.e. separating parenthetical words and phrases);

(2) setting off (separating participial and verbal-adverb phrases
as well as set-off attributives and appositives);

(3) similar-simple (dividing similar members of a sentence);

(4) similar-complex (demarcating the parts of a compound sentence);

(5) dissimilar (i.e. introducing a subordinate clause).

4. According to our data, separation of the simple clauses occurs
within the limits of the complex whole.

The entire process of analyzing punctuation marks can be divided into
3 stages:

(1) Separation of the purely parenthetical constructions takes
place in the analysis glossary, where the words that may be
used parenthetically or that are a basic part of a parenthetical
phrase undergo special analysis, after which the punctuation
marks that separate them receive an appropriate information sign.

(2) Processing of punctuation marks by the "Punctuation Marks"
routine, where the basic analysis of all the punctuation
marks takes place.

(3) Breakdown of the sentence into its constituent parts:
separation of parenthetical and set-off constructions,
dividing of simple clauses, etc. Here the occurrence of a
"non-single" mark is extremely important. This routine also
provides for insertion of a sentence-demarcating punctuation
mark where necessary.

5. The "Analysis of Punctuation Marks" routine consists of several
parts, each of which corresponds roughly to a given punctuation mark.

Within each part several checks are made on a number of individual
factors that determine the function of the multipurpose punctuation marks.

These factors include the presence of verbs with the sign "LF" (LICHNAYA
FORMA) /personal form/ on both sides of a given mark (or on one side of it),
the presence of verbs with the sign "NE LICHNAYA FORMA" /non-personal form/,
the place of a substantive with the sign FS ("FORMA SLOVARNAYA") /dictionary
form/ in respect to the given mark, the separation of words belonging to a
given lexical group, etc.

6. As a result of our investigation, all the punctuation marks are
provided with the requisite distinguishing features and the analysis is
performed accordingly within the separated simple units.


64. SOME PROBLEMS CONNECTED WITH THE ANALYSIS OF
COMPLEX SENTENCES AND CLAUSES WITH SIMILAR MEMBERS


Ye. V. Paducheva (Moscow)


1. The following problems must be solved in connection with the
syntactic analysis of complex sentences and clauses with similar members:

(a) To distinguish between a syntagma with similar members and
clause coordination (the difficulties in solving this problem
are explained by the fact that most of the co-ordinating
conjunctions (I, ILI, NO /and, or, but/) may connect both similar
members of clauses and entire clauses, and therefore they cannot
serve as a trustworthy sign either of clause boundary or
of a syntagma with co-ordinating connective /SOCHINITEL'NOI
SVYAZ'YU/).

(b) To separate words interlinked by a co-ordinating connective,
having divided them from the words governed by them.


2. For this purpose we propose the following method of analyzing
sentences with co-ordinating conjunctions (only 2-member combinations are
considered for the time being): The sentence is cut up into "chunks", the
limits of which are co-ordinating conjunctions, and the syntactic analysis
is performed within the chunks; if after completion of syntactic analysis
within the chunk no words remain without a governor, it means that the
conjunction connects two clauses; if, however, such words remain, it means
that the sentence contains similar members. Words lacking a governor are,
for the most part, members of a co-ordinating syntagma.
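
The chunk method can be sketched as follows. This Python fragment is an
illustration under loud assumptions: the real syntactic analysis within a
chunk is replaced by a toy stand-in (a finite verb is treated as the root of
its chunk and is taken to govern every other word in it), and the tag names
and function names are hypothetical. Only the overall decision procedure
follows the text.

```python
def split_into_chunks(tokens):
    """Cut a tagged sentence into chunks at co-ordinating conjunctions."""
    chunks, current = [], []
    for word, tag in tokens:
        if tag == "CONJ":
            chunks.append(current)
            current = []
        else:
            current.append((word, tag))
    chunks.append(current)
    return chunks

def toy_ungoverned(chunk):
    """Stand-in for real analysis (an assumption of this sketch): a chunk
    with a verb leaves no word without a governor; a chunk without a verb
    leaves all of its words without a governor."""
    if any(tag == "VERB" for _, tag in chunk):
        return []
    return [word for word, _ in chunk]

def classify_coordination(tokens, ungoverned=toy_ungoverned):
    """Decide between clause coordination and similar members."""
    leftovers = [word
                 for chunk in split_into_chunks(tokens)
                 for word in ungoverned(chunk)]
    if leftovers:
        # Words lacking a governor are candidate members of a
        # co-ordinating syntagma.
        return "similar members", leftovers
    return "clause coordination", []
```

The `ungoverned` parameter is where a real within-chunk analysis would be
plugged in; the toy version exists only so that the sketch runs end to end.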

3. When words are combined into a coordinating syntagma, the concept
of "sameness of form" /RAVNOOFORMLENNOST'/ is used. "Sameness of form"
is the coincidence of several of their morphological and syntactic signs.

The same form is sought beyond the chunk for a word that lacks a governor
within the chunk and a coordinating syntagma is thereby established.

(This must be refined somewhat due to the possible absence of agreement
in number for words within the chunk, etc.).

4. This method of analysis is feasible for Russian because a word
normally contains all the information regarding the possible syntactic
connections for it (with some exceptions; compare, e.g., the homonymy
of cases, which may make the syntactic function of a word in the
chunk indefinite). This method is impracticable in English (e.g. the
syntactic functions of a substantive are determined wholly by its position
after a transitive verb, before another substantive, etc.; therefore,
superfluous "subjects" would appear after the division into chunks is made).

However, some of the difficulties mentioned for Russian disappear in
English during the analysis of a sentence with co-ordinating conjunctions
due to the rigid word order and preferential position of the governed word
after the governor. English syntagmas with co-ordinating connectives are
determined at the same time as the others during the course of syntactic
analysis.

5. Some methods of fixing the boundaries of a simple clause inside
a complex clause are indicated.

65. MACHINE TRANSLATION OF COMPOUND NOUNS FROM

GERMAN

V, 7. Parshin (Moscow) 

1. The extensive use of compounds in German, particularly in scientific
and technical literature, has made it necessary to work out universal rules
for their translation.

Formulation of such rules makes possible a significant reduction in
the size of the dictionary and the translation of compounds, provided that
their components are known.

Universal rules for the translation of compounds are deduced from a
structural-semantic analysis of the constituent words. Determination of
semantic connections between them ensures an adequate translation.

The author's investigations do not pretend to be a complete and
final solution to the problem of translating compounds. They are merely
an initial, empirical attempt at working out the basic principles and
methods that would permit of a more or less successful translation at the
first stage.

2. The existence of the following types of connections between the
stems of compounds has been demonstrated by an analysis of concrete
linguistic material (individual original works on mathematics and a
German-Russian polytechnical dictionary):

1. Relation of the sum to the constituents.

2. Relation of a part to the whole.

3. Object or subject of an action to the action.

4. Object of the bearer of a quality to the quality.

5. Object of a determiner to the thing determined.

Translation of the first component of compound words, the internal
connections of whose components relate to the first four types, is
effected by producing a Russian equivalent in the genitive case.

If the last type of connection is present, the first component is
translated in two ways: by an adjective and by the production of a Russian
equivalent in the genitive case.

Polysemy gives rise to a particular type of connection for each meaning
of the word. Therefore, a semantic analysis of the components is necessary
to differentiate the types of relations between the constituent elements.

Differentiating the relation of a part to the whole and the relation
of a determiner to the thing determined is the most difficult of all.

3. A special case is the translation of compounds consisting of three
components. It is important here to establish the co-subordination of
determining stems to the determined stem, which is done by subjecting them
to analysis in pairs.

Three-component words are translated in accordance with the rules for
translating two-stem words.

4. Compounds of the input text are broken down into constituent
stems by the superposition of stems included in the dictionary, taking
into account connecting consonants and rejected endings.
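
The breakdown of a compound by superposition of dictionary stems can be
illustrated with a short Python sketch. Everything specific in it is an
assumption made for the example: the list of connecting elements, the single
rejected ending "e", the function name, and the tiny stem dictionary. It
shows the general technique only and is not the author's procedure.

```python
LINKERS = ("", "s", "es", "n", "en")  # assumed connecting elements
DROPPED = ("", "e")                   # assumed endings rejected in compounds

def split_compound(word, stems):
    """Return the list of dictionary stems making up `word`, or None."""
    word = word.lower()
    for i in range(1, len(word)):
        head = word[:i]
        for dropped in DROPPED:
            # The head may appear without an ending it has in the dictionary.
            if head + dropped not in stems:
                continue
            rest = word[i:]
            for link in LINKERS:
                if not rest.startswith(link):
                    continue
                tail = rest[len(link):]
                if tail in stems:
                    return [head + dropped, tail]
                # Three or more components: analyse the remainder in turn.
                sub = split_compound(tail, stems) if tail else None
                if sub:
                    return [head + dropped] + sub
    return None
```

With the stems "zahl", "theorie", "schule" and "buch", the word
"Zahlentheorie" splits into zahl + theorie (connecting element "en") and
"Schulbuch" into schule + buch (rejected ending "e"). A word that cannot be
covered by dictionary stems yields None.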

5. The principles and methods of translating German compounds
into Russian, as set forth above, can serve as the basis for a definitive,
detailed solution of one of the most complicated lexicographical problems
in German.


66. PROPER NOUNS IN MACHINE TRANSLATION

A. V. Superanskaya (Moscow)

1. Proper nouns are unavoidably present in every scientific text.

2. In the present state of development the machine translates a
text, but leaves proper nouns just the way they are, printing them in
Latin letters.

3. Since the number of proper nouns increases as one proceeds from
selective to continuous translation, the question of the desirability of
automatising the process of transcribing proper nouns arises.

4. Proper nouns are not always written, read, and pronounced in
all languages in accordance with the rules for common nouns.

5. Proper nouns are international. The same nouns are encountered
among peoples of different nationality. People move from country to
country and publish their papers in different countries in different
languages. That is the reason for the difficulty in determining the
nationality of a noun and, accordingly, the rules by which it should be
transcribed.

6. There is much inconsistency in the current transcription of
nouns. The need to unify the transcription and eliminate the lack of
uniformity is long overdue.

7. Due to the limitless memory potentialities of the machine and the
difficulty of mechanical analytical transcription, it is more efficient
to store proper nouns as a whole in the machine's memory. Consequently,
if it encountered such a noun in a text, the machine would locate it in
the glossary and deliver the answer (simple or in several variants,
depending on the linguistic origin of the noun and on existing traditions).
This would help to make transcription uniform, and it could be accompanied
by a printed glossary to match.
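
Storing proper nouns whole amounts to a table look-up. The Python sketch
below illustrates the idea; the glossary entries are illustrative examples
chosen for this sketch, not data from the paper, and the fallback of leaving
an unknown name in Latin letters mirrors the present behaviour described in
point 2.

```python
# Hypothetical glossary: each name maps to one or more stored Russian
# variants, the traditional one listed first.
GLOSSARY = {
    "Euler": ["Эйлер"],
    "Hamilton": ["Гамильтон", "Хэмилтон"],
}

def transcribe(name, glossary=GLOSSARY):
    """Return the stored variants for `name`; if the name is not in the
    glossary, return it unchanged in Latin letters."""
    return glossary.get(name, [name])
```

Because the answer comes from a single shared table, every text that
mentions the same name receives the same transcription, which is the
uniformity the abstract argues for.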

67. WORK ON A BURMESE-RUSSIAN ALGORITHM OF
MACHINE TRANSLATION

O. A. Timofeyeva (Leningrad)

1. The syllabic nature of Burmese writing requires the elaboration
of a special program by which an electrical reading device can handle a
Burmese text.

2. We are compelled to restrict the algorithm to the literary form
of Burmese speech owing to the sharp divergences between the written and
contemporary spoken languages.

3. A highly developed word-building root structure that crosses
with a form-building root structure makes it necessary to have a special
word-building program, the purpose of which is to separate lexical from
morphological phenomena.

4. The development of agglutination and the rudiments of internal
inflection require the construction of a complicated morphological
program for handling the abundant and varied grammatical information
contained in the Burmese word.

5. The absence of a rigid order for nominal members of the Burmese
sentence complicates the syntactic program, which cannot be effected
without the preliminary operation of the morphological program.

68. WORK ON AN ARABIC-RUSSIAN ALGORITHM OF
MACHINE TRANSLATION


O. B. Frolova (Leningrad)

I. Items from newspapers are used as texts in machine translation from
Arabic to Russian.

II. The main principles in working on an Arabic-Russian algorithm of
machine translation, as contrasted with those of traditional grammar,
are as follows:

(a) Only the written form of the language with the infixes consonants 
and long vowels is considered, whereas all the existing grammars take into 
account the short vowels, which are not normally noted in writing. For 
Arabic two algorithms, differing in principle, are necessary: one for the
spoken language, the other for the written; the two variants are not
reducible to each other.

(b) The traditional dictionary arranged by roots is replaced by a
dictionary arranged by stems.

(c) For convenience in transliterating Arabic letters into Russian
letters, the latter are used with no additional signs of any kind.

III. The programs making up the algorithm are as follows: (1)
stem-stripping, (2) address, (3) morphological, (4) syntactic, (5)
dictionary of stems, (6) table of prepositions, (7) glossary of idioms and
phrases, (8) program for distinguishing homonyms.

IV. Work on the stem-stripping program:

(a) Initial variations of this program provided for cutting off the
stems, prefixes, and suffixes; the glossary increased considerably,
however, due to pseudostems.

(b) An important factor in simplifying this program was the idea of
a reject /OTKAZNOI/ glossary, which was later developed into the idea of an
address, used in other algorithms too.

(c) The stem-stripping program includes the following rules:

(1) Out of the 28 letters of the Arabic alphabet 10 letters may
be joined as non-radicals to the beginning of a word: these
are certain conjunctions and prepositions, the definite
article, and verbal prefixes.

(2) In the case of words that do not contain initial non-radical
letters it is necessary to refer at once to the address;
endings and suffixes are automatically stripped upon
comparing the words with the stems found in the address.

(3) Some of these initial non-radical letters, which when cut
off reveal an insignificant number of pseudostems, are first
transferred to the end of the words and converted into
suffixes that are kept apart; the words are then sought in the
address.

(4) Words with the remaining initial non-radical letters, which if
cut off would result in a large number of pseudostems, are
first checked in the address; if they are not found there,
the non-radicals are transferred to the end of the words,
and the words are again looked up in the address. Checking
for their presence in the address is not equivalent to
extracting from the address all the information relating to
the stem.
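
Rules (2) to (4) describe a small decision procedure, sketched below in
Python. The sketch is illustrative and rests on assumptions: words are given
in a made-up Latin transliteration, the address is modelled as a plain set
of stems, endings are stripped by longest-prefix matching, and the division
of initial letters into a "safe" group (rule 3) and a "risky" group (rule 4)
is supplied by the caller. All function and parameter names are hypothetical.

```python
def lookup_stem(word, address):
    """Return the longest stem in `address` that `word` begins with, or
    None. Whatever follows the stem counts as a stripped ending."""
    matches = [stem for stem in address if word.startswith(stem)]
    return max(matches, key=len) if matches else None

def strip_stem(word, address, safe, risky):
    """Return (stem, detached_initial_letter) or None if nothing is found.

    `safe`  : initial non-radicals that rarely create pseudostems (rule 3)
    `risky` : initial non-radicals that often create pseudostems (rule 4)
    """
    first = word[:1]
    if first in safe:
        # Rule (3): detach the letter first, then consult the address.
        stem = lookup_stem(word[1:], address)
        return (stem, first) if stem else None
    if first in risky:
        # Rule (4): check the whole word first; detach only on failure.
        stem = lookup_stem(word, address)
        if stem:
            return (stem, "")
        stem = lookup_stem(word[1:], address)
        return (stem, first) if stem else None
    # Rule (2): no initial non-radical, go to the address at once.
    stem = lookup_stem(word, address)
    return (stem, "") if stem else None
```

Checking the undivided word first in the risky case is what keeps a genuine
stem that happens to begin with a non-radical letter from being cut into a
pseudostem.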

69. EXPERIMENTAL TRANSLATIONS FROM FRENCH INTO RUSSIAN

G. V. Chekova (Moscow)

Devising of algorithms for translation from French to Russian.

Sequence of operations for translation programs. Changes in programs
and coding of glossary on the basis of experimental translations produced
by the machine.

Utilisation of scales in translation programs.

Programming characteristics: scope of programs and glossary; operations
utilized in translation programs; numerical characteristics of translation
programs.

Basic demands made of a special translation machine.

Examples of translations produced by the STRELA machine in 1957-1958.

70. ESTABLISHMENT OF SYNTACTIC CUES FOR
[remainder of title illegible]

le Ho Shelimova (Moscow) 

1. The object in making a syntactic analysis of prepositional phrases
consisting of either a preposition and substantive standing to the right of
it or a preposition and pronoun immediately adjacent to it on the right is
to include these prepositional phrases in a syntagma. It is necessary,
therefore, to find a word in the sentence with which the prepositional
phrase forms a syntagma.

2. There are no complications in drawing up the rules for the
formal analysis of prepositional phrases if a word that belongs to a
class of words capable of forming a syntagma with the prepositional
phrase is found immediately to the left of the prepositional phrase. The
only exception is a case where a noun stands next to the prepositional
phrase. Thus, if there is any verbal form (infinitive, participle (short
or full), verbal adverb), an adjective (full or short), or a word from the
special group of invariable words on the left of the prepositional phrase,
the prepositional phrase forms a syntagma with this particular word.

3. If on the left of the prepositional phrase is a word that belongs
to a class of words with which the prepositional phrase does not generally
form a syntagma (pronouns, adverbs, particles, conjunctions) or the
prepositional phrase stands at the very beginning of the sentence, then the
word with which the prepositional phrase forms a syntagma must be searched
for in the following order:

(a) Search to the left for the next word with which the
prepositional phrase may become a syntagma, excluding a noun, i.e. search
for any form of verb, adjective or special kind of invariable word. A
prepositional phrase may unite in a syntagma with several of the classes of
words listed after it fulfills a series of conditions.

(b) Search to the right for the next word belonging to the class
of verbs (except the full participle and verbal adverb) or a word from
the special group of invariable words or a short adjective. Actually, while
searching for a word on the right with which the prepositional phrase may
form a syntagma, we are looking for a word in the predicate of the sentence.
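
The search order of points 2 and 3 can be sketched as a simple scan over
part-of-speech tags. The Python fragment below is an illustration only: the
prepositional phrase is treated as a single position in the tag list, the
tag names and the two host sets are hypothetical, the "series of conditions"
mentioned in (a) is not modelled, and a noun standing immediately to the
left of the phrase is returned as undecided because the text treats that
case separately.

```python
# Hypothetical tags for words that may form a syntagma with the phrase.
LEFT_HOSTS = {"VERB", "INF", "PART", "SHORT_PART", "VERBAL_ADV",
              "ADJ", "SHORT_ADJ", "INVAR"}
# On the right: verbs except the full participle and the verbal adverb,
# the special invariable words, and short adjectives.
RIGHT_HOSTS = {"VERB", "INF", "SHORT_PART", "SHORT_ADJ", "INVAR"}

def attach_prepositional_phrase(tags, pp_index):
    """Return the index of the word the phrase attaches to, or None."""
    if pp_index > 0 and tags[pp_index - 1] == "NOUN":
        return None  # preposition-specific rules apply; not modelled here
    # Immediate left neighbour first, then further left, skipping nouns
    # and the classes that do not host a prepositional phrase.
    for i in range(pp_index - 1, -1, -1):
        if tags[i] in LEFT_HOSTS:
            return i
    # Then to the right, in effect looking for a word in the predicate.
    for i in range(pp_index + 1, len(tags)):
        if tags[i] in RIGHT_HOSTS:
            return i
    return None
```

For the sequence PRON, PP, VERB the left scan finds nothing suitable and the
phrase attaches to the verb on the right; for VERB, ADV, PP it attaches to
the verb two places to the left.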

4. If a prepositional phrase stands next to a noun (immediately to
the left of the noun), the rule for establishing the syntagma constituted
by this phrase is not general for prepositional phrases with different
prepositions.

5. Therefore, any of the following may be significant in determining
the rules for analyzing prepositional phrases with a number of prepositions:

(a) The lexical composition of the prepositional phrase itself;

(b) Does the prepositional phrase have on its left a noun which
by virtue of its syntactic or lexical properties is such that its connection
with the prepositional phrase must be regarded as certain?

(c) Does the sentence have any verbal form that by virtue of
syntactic or lexical properties must be regarded as necessarily connected
with a given prepositional phrase?
6. The structure of the sentence is particularly important in 
establishing the rules of syntactic analysis for prepositional phrases 
with several other prepositions (e.g. v with the prepositional case 
and dlya /for/). In order to determine the regular syntactic cues for the 
prepositional phrases mentioned, it is necessary in certain cases to 
know if the prepositional phrase stands before or after the predicate or 
which syntagma contains the noun that is followed by the prepositional 
phrase. Sometimes it is important to know whether or not this noun in 
turn forms a prepositional phrase with certain prepositions (e.g. 
v rezul'tate /as a result of/, posle /after/, etc.) because in such a 
case a prepositional phrase with dlya or v cannot be related to this 
noun. 

71. CORRELATION BETWEEN 3RD PERSON PERSONAL PRONOUNS 
AND THE NOUNS FOR WHICH THEY SUBSTITUTE 

A. L. Shumilina (Moscow) 

1. In machine translation the 3rd person personal pronouns of one 
language cannot be mechanically substituted for the corresponding 
pronouns of another language since gender is not an inherent sign of every 
pronoun, but depends on the gender of the corresponding noun, which is 
accidental as far as they are concerned and specific for the different 
languages. 

2. The following formal data must be obtained first if the correlation 
between a pronoun and the corresponding substantive is to be established: 

(a) The boundaries of the clauses (no cognisance is taken of the 
differences between the boundaries of clauses within sentences and sentence 
boundaries); 

(b) The grammatical properties of the substantives and 3rd 
person personal pronouns (gender, number, case); 

(c) The syntactic relations and specific syntactic functions 
of the substantives; 

(d) The order of substantives in the clauses; 

(e) Certain sequences of syntactically related words (e.g. 
expanded attributes). 

3. A substantive for which a given pronoun is used must "correspond 
grammatically" to this pronoun. By grammatical correspondence we mean the 
correspondence between substantive and pronoun in number (correspondence 
in number will in several cases differ from the conventional) and gender (in 
the singular). 
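
The correspondence test of thesis 3 can be sketched as a small predicate. The `Form` record and its value codes are hypothetical, and number agreement is simplified to plain equality even though the abstract notes that it sometimes departs from the conventional.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Form:
    """Hypothetical record of the grammatical data named in thesis 2(b)."""
    number: str  # "sg" or "pl"
    gender: str  # "m", "f" or "n"


def corresponds(substantive, pronoun):
    """True if the substantive "corresponds grammatically" to the pronoun:
    same number, and same gender when the pronoun is singular."""
    if substantive.number != pronoun.number:
        return False
    if pronoun.number == "sg":
        return substantive.gender == pronoun.gender
    return True  # gender is not compared in the plural


print(corresponds(Form("sg", "f"), Form("sg", "f")))
```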
4. The way to determine the corresponding ("unknown") substantive 
is, for the most part, as follows: 

The search for the grammatically corresponding word is made only to 
the left of the given pronoun (omitting the previously determined elements 
in the clause). 

A. Within a zero clause /Clauses subject to analysis are 
numbered: zero (0) - a clause within which the given pronoun is found, 
first (1) - next clause to the left of the zero, second (2) - next clause 
to the left of the first, etc./ 

(a) For pronouns in the nominative case, the only possible 
unknown substantive may be one with a sign of the "grammatical subject" 
(this concept is defined beforehand). 

(b) For pronouns in other than the nominative case, the unknown 
word is the substantive that is closest to the given pronoun, but with 
certain restrictions (e.g. the unknown substantive must not form a single 
word combination with the given pronoun, nor must it be the middle word or 
word on the extreme right in a chain of genitive cases, if the word on the 
extreme left satisfies the sign of "grammatical correspondence", etc.). 
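
The zero-clause search of part A can be sketched as follows. The dictionary keys, the role and case codes, and the `restricted` callback are hypothetical; the callback stands in for the restrictions that the abstract only exemplifies, and the gender values in the example are illustrative, not data from the source.

```python
def corresponds(noun, pronoun):
    """Agreement in number, and in gender when the pronoun is singular."""
    if noun["number"] != pronoun["number"]:
        return False
    return pronoun["number"] != "sg" or noun["gender"] == pronoun["gender"]


def search_zero_clause(left_substantives, pronoun, restricted=lambda n: False):
    """left_substantives -- substantives standing to the left of the pronoun
    inside its own clause, in sentence order (left to right).
    Returns the antecedent found in the zero clause, or None."""
    for noun in reversed(left_substantives):  # nearest to the pronoun first
        if not corresponds(noun, pronoun):
            continue
        if pronoun["case"] == "nom":
            # (a) only a "grammatical subject" is admissible
            if noun.get("role") == "subject":
                return noun
        elif not restricted(noun):
            # (b) nearest corresponding substantive that passes the restrictions
            return noun
    return None


nouns = [
    {"word": "table", "number": "sg", "gender": "m", "role": "subject"},
    {"word": "book", "number": "sg", "gender": "f", "role": "direct_object"},
]
found = search_zero_clause(nouns, {"case": "acc", "number": "sg", "gender": "f"})
print(found["word"])
```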

B. Within the first, second ... nth clause (the analysis is made 
successively within the 1st, 2nd ... nth clause until the word that satisfies 
our requirements is found). 

For pronouns both in the nominative and in other cases, a word with 
a sign of the "grammatical subject" is considered first; in the event 
that there is no grammatical correspondence between the pronoun and the 
"grammatical subject" found, we pass on to a word with a sign of the 
"grammatical direct object", then to the substantive that is closest to 
the right boundary of the 1st or nth clause (taking into account the 
various restrictions already determined). 
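
The clause-by-clause search of part B can be sketched as follows. The data layout, role codes, and `restricted` callback are hypothetical. Continuing leftward from the right boundary when the closest substantive fails is an assumption of the sketch, since the abstract names only the closest one.

```python
def corresponds(noun, pronoun):
    """Agreement in number, and in gender when the pronoun is singular."""
    if noun["number"] != pronoun["number"]:
        return False
    return pronoun["number"] != "sg" or noun["gender"] == pronoun["gender"]


def search_earlier_clauses(clauses, pronoun, restricted=lambda n: False):
    """clauses[0] is clause 1 (immediately left of the pronoun's clause),
    clauses[1] is clause 2, and so on; each clause is a list of substantives
    in sentence order. Returns (clause number, substantive) or None."""
    for number, clause in enumerate(clauses, start=1):
        subjects = [n for n in clause if n.get("role") == "subject"]
        objects = [n for n in clause if n.get("role") == "direct_object"]
        from_right = list(reversed(clause))  # closest to right boundary first
        for noun in subjects + objects + from_right:
            if corresponds(noun, pronoun) and not restricted(noun):
                return number, noun
    return None


clauses = [
    [  # clause 1
        {"word": "machine", "number": "sg", "gender": "f", "role": "subject"},
        {"word": "pump", "number": "sg", "gender": "m", "role": "direct_object"},
        {"word": "motor", "number": "sg", "gender": "m", "role": "other"},
    ],
    [  # clause 2
        {"word": "device", "number": "sg", "gender": "n", "role": "subject"},
    ],
]
hit = search_earlier_clauses(clauses, {"number": "sg", "gender": "m"})
print(hit[0], hit[1]["word"])
```

In the example the direct object is preferred over the substantive nearer the right boundary, which reflects the order of preference stated in the thesis.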

5. Similar work in the future may, with appropriate additions 
(animateness in nouns and other criteria), be significant from the point 
of view of practical stylistics, i.e. it may create the possibility of 
determining certain purely formal rules for using 3rd person personal 
pronouns on the basis of the laws of the language itself. 

* * * 

- END - 


us jprs/dc 



Approved For Release 2000/08/24 : CIA-RDP68-00069A0001 00200007-9 



